Show HN: Index and search *all* your documents https://ift.tt/GPVhiUl

Hey HN! I've built a simple tool to index and search your documents. It uses two great open source libraries: Apache Tika (for extracting content from documents) and Apache Lucene (for searching). It's built with Kotlin, using Ktor as the web framework. You can index all kinds of files (e.g. doc, docx, xls, ppt, pdf, txt, html, even OCR PDFs) and then search them using advanced queries such as "always contain X", "never contain X", "X near Y", wildcard search, proper stemming support, etc. We're using it at my work, where we have hundreds of thousands of doc/docx/pdf files, and it works flawlessly! https://ift.tt/RwzSxZa August 11, 2024 at 12:14AM
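To make the query types above concrete, here is a minimal sketch of what "always contain X", "never contain X" and "X near Y" mean over a positional inverted index. This is a toy illustration in Python, not the tool's code and not Lucene itself; the function names (`tokenize`, `build_index`, `search`) and the sample documents are made up for the example. In Lucene's classic query parser these constraints are usually written `+X`, `-X` and `"X Y"~N`, though the post does not say whether the tool exposes that syntax directly.

```python
from collections import defaultdict

PUNCT = '.,;:!?"()'

def tokenize(text):
    """Lowercase, split on whitespace, strip basic punctuation (no stemming)."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def build_index(docs):
    """Map each term to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def search(index, all_ids, must=(), must_not=(), near=None):
    """Return sorted doc ids matching the constraints.

    must:     terms every hit has to contain ("always contain X")
    must_not: terms no hit may contain ("never contain X")
    near:     optional (term_a, term_b, distance) proximity constraint
    """
    result = set(all_ids)
    for term in must:
        result &= set(index.get(term.lower(), {}))
    for term in must_not:
        result -= set(index.get(term.lower(), {}))
    if near is not None:
        a, b, distance = near
        pos_a = index.get(a.lower(), {})
        pos_b = index.get(b.lower(), {})
        result = {
            d for d in result
            if d in pos_a and d in pos_b
            and any(abs(i - j) <= distance for i in pos_a[d] for j in pos_b[d])
        }
    return sorted(result)

# Hypothetical sample documents; in the real tool the text would come from Tika.
docs = {
    "report.pdf": "Quarterly revenue grew while costs fell.",
    "memo.docx": "Revenue targets were missed; draft only.",
    "notes.txt": "Costs and revenue are reviewed every quarter.",
}
index = build_index(docs)

print(search(index, docs, must=["revenue"], must_not=["draft"]))
print(search(index, docs, near=("costs", "revenue", 2)))
```

A real Lucene index adds analysis (stemming, stop words), scoring and on-disk segments on top of this idea; the sketch only shows how the boolean and proximity constraints narrow the set of matching documents.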


Show HN: Oodle – serverless, fully-managed, drop-in replacement for Prometheus https://ift.tt/8dsQwyA

Hello HN! My co-founder Vijay and I are excited to open up O...