Roadmap

The Roadmap page contributes to transparency and communication among the members of the community, including users, potential users and developers.

If you are a user (or potential user) of Kneobase, the Roadmap reveals the current strategy and the path ahead. It is just as important if you are interested in contributing to ongoing development.

 

Summary of current features (v1.3) and the roadmap for planned releases v1.5 & v2.

Features v1.3

  • Allows configuration of different content sources.
  • Centralized search capability for multiple content sources
  • Indexing and searching in Spanish, English, Portuguese, German, French and Dutch.
  • Language auto-detection at indexing time (configurable).
  • Configuration of multiple search hierarchies to make precise distinctions of content by type, location, etc.
  • Support for indexing of files embedded in compressed archives:
    • ZIP
    • TAR
    • GZIP
    • TGZ
    • Viewer Servlet for files inside zip archives
  • Search for multiple file types (like the MIME-types configuration of a Web-server)
  • Full and incremental indexing, through a cron-like configuration
  • Web based administration point.
  • API for searching.
  • SPI for adding new content sources.
  • Advanced textual queries (fuzzy, pattern-based, etc.). Pattern-based search works much like DOS wildcards: * and ?.
  • Concurrent indexing and searching.
  • One index per content source. Update, enable/disable content sources independently.
  • SOAP server for access from non-Java based apps: .NET, PHP, etc.
  • Multiple file types support (MIME-TYPES)
    • PDF
    • MS Word
    • MS Powerpoint
    • MS Excel
    • HTML
    • XML
    • OpenOffice
    • … (see configuration files for the complete list)
  • Drivers: FS, FTP, RDBMS (phpcollab, mantis, tutos), OpenCMS, Tree (generic).
  • Out-of-the-box way of indexing standard meta-data, e.g.: "Title".
  • Search exceptions on the Kneobase Webservice mapped to SOAP fault codes, giving improved error reports on the client side.
  • Out-of-the-box installation with a self-installing Windows package
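
The pattern-based search listed above is described only as "much like DOS wildcards". As an illustration of those semantics, here is a minimal Python sketch that assumes `?` stands for exactly one character and `*` for any run of characters (including none). The function names `wildcard_to_regex` and `match_terms` are hypothetical; this is not Kneobase's or Lucene's actual implementation.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a DOS-style wildcard pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters (including none),
    '?' matches exactly one character, everything else matches literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escape literal characters so that '.', '+', etc. are not
            # interpreted as regex operators.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def match_terms(pattern, terms):
    """Return the terms that match the whole wildcard pattern, in order."""
    rx = wildcard_to_regex(pattern)
    return [t for t in terms if rx.fullmatch(t)]


# Example: '?' must consume exactly one character, so "toast" is rejected.
print(match_terms("te?t", ["test", "text", "toast"]))  # ['test', 'text']
```

A real index would match such a pattern against its term dictionary rather than a Python list; the sketch only shows how the two wildcard characters behave.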

Features v1.5

  • Web crawler integration (Nutch? Spindle?)
  • On-demand indexing
  • Implement a standard XML schema for search results (Google Data and/or OpenSearch)
  • Spell checker ("Did you mean this...")
  • Integration/collaboration with other OSS software such as: Solr, Compass
  • Upgrade Lucene to 2.0
  • New drivers
    • SharePoint Driver
    • Commons VFS Driver
    • NFS Driver
    • Pipermail driver (or use web crawler?)
  • Test other popular databases (e.g. MS-SQL)
  • Support for files embedded in RAR archives
  • RDBMS improvements
    • implement getType() for automatic MIME type detection
    • extend the RDBMS driver to reach external documents referenced by a db field
    • bundle the most commonly used JDBC drivers (Oracle, DB2, SQL Server, Sybase and MySQL).
  • Customer satisfaction monitoring when searching
    • Queries
    • Query sequence within a search session
    • Number of hits
    • Date and time
    • IP address or unique code
    • Searching reports for 100 most popular results
    • Searching reports for most popular searches that were not found
  • Statistics
    • Total up time
    • Total docs
    • Total searches served (? avg per day)
  • Domain-specific translation of Spring configuration files
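
The search monitoring items above (queries, number of hits, date and time, a per-user code) suggest a simple per-search log record from which the planned reports could be derived. The following Python sketch shows one way the "most popular searches that were not found" report might be computed. The `SearchEvent` record, its field names, and `top_unanswered_queries` are all hypothetical and are not part of any Kneobase API.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SearchEvent:
    """One logged search (hypothetical record layout)."""
    session_id: str      # IP address or unique code identifying the session
    query: str           # query text as typed by the user
    hits: int            # number of results returned
    timestamp: datetime  # date and time of the search


def top_unanswered_queries(events, limit=100):
    """Return (query, count) pairs for the most frequent zero-hit queries.

    Queries are normalised (trimmed, lower-cased) so that trivially
    different spellings are counted together.
    """
    counts = Counter(
        e.query.strip().lower() for e in events if e.hits == 0
    )
    return counts.most_common(limit)
```

Sorting the same events by `session_id` and `timestamp` would give the query sequence within each search session, another item on the list.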

Features v2

  • Support for more file types
    • Flash?
    • Autocad?
    • ...
  • Dynamic class-loading of parsers and drivers classes (using JMX).
  • Cached documents
  • Lightweight XML service from JSP (REST approach)
  • Direct feedback from user: let them make annotations and content scoring
  • Grouping results functionality similar to Google's "Similar Pages" (mainly for forum threads or content with lots of hits)
  • Automatic content ranking based on popularity
  • Federated network of Kneobase servers
  • Fine-grained security
  • Online visual configuration editing for each driver
    • Driver configuration wizards (notably RDBMS)
    • SPI extensions for supporting wizards and online editing
  • Thesauri (needs further research, it would be very interesting if Kneobase could provide an automatic classification for browsing knowledge base contents)
  • Implement a JMX interface for administration (needs research)
  • Save/restore server configuration (drivers/content sources/parsers/builders?)
  • Decouple driver configuration and connection
  • Versioning of configuration files
  • Webstart updates

 
