Package org.apache.poi.hssf.extractor
Class EventBasedExcelExtractor
java.lang.Object
org.apache.poi.extractor.POITextExtractor
org.apache.poi.extractor.POIOLE2TextExtractor
org.apache.poi.hssf.extractor.EventBasedExcelExtractor
- All Implemented Interfaces:
Closeable, AutoCloseable, ExcelExtractor
A text extractor for Excel files that is based on the HSSF EventUserModel API.
It will typically use less memory than ExcelExtractor, but may not provide the same richness of formatting.
Returns the textual content of the file, suitable for indexing by something like Lucene, but not really intended for display to the user.
To turn an Excel file into CSV or a similar format, see the XLS2CSVmra example.
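A minimal usage sketch, assuming Apache POI is on the classpath and that the extractor is built from a POIFSFileSystem. The class name EventExtractorSketch and its helper methods are hypothetical and exist only for illustration; the sample workbook is built in memory so that no file is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical class name; not part of Apache POI.
public class EventExtractorSketch {

    // Reads an .xls stream and returns its text via the event-based extractor.
    static String extractText(InputStream xls) {
        try {
            POIFSFileSystem fs = new POIFSFileSystem(xls);
            // Assumption: closing the extractor also releases the filesystem it was given.
            try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
                return extractor.getText();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny in-memory workbook (one sheet, one string cell) for the demo.
    static byte[] sampleWorkbook() {
        try (HSSFWorkbook wb = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the extracted text of the sample workbook.
        System.out.println(extractText(new ByteArrayInputStream(sampleWorkbook())));
    }
}
```

The returned string is plain text meant for indexing, so callers that need cell-level structure would probably be better served by the usermodel API or the XLS2CSVmra approach.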
-
Field Summary
Fields inherited from class org.apache.poi.extractor.POIOLE2TextExtractor
document
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- DocumentSummaryInformation getDocSummaryInformation(): Would return the document information metadata for the document, if we supported it
- SummaryInformation getSummaryInformation(): Would return the summary information metadata for the document, if we supported it
- String getText(): Retrieves the text contents of the file
- void setFormulasNotResults(boolean formulasNotResults): Should we return the formula itself, and not the result it produces? Default is false
- void setIncludeCellComments(boolean includeComments): Would control the inclusion of cell comments from the document, if we supported it
- void setIncludeHeadersFooters(boolean includeHeadersFooters): Would control the inclusion of headers and footers from the document, if we supported it
- void setIncludeSheetNames(boolean includeSheetNames): Should sheet names be included? Default is true
Methods inherited from class org.apache.poi.extractor.POIOLE2TextExtractor
getDocument, getMetadataTextExtractor, getRoot
Methods inherited from class org.apache.poi.extractor.POITextExtractor
close, setFilesystem
-
Constructor Details
-
EventBasedExcelExtractor
-
EventBasedExcelExtractor
-
-
Method Details
-
getDocSummaryInformation
Would return the document information metadata for the document, if we supported it
- Overrides:
getDocSummaryInformation in class POIOLE2TextExtractor
- Returns:
- The Document Summary Information or null if it could not be read for this document.
-
getSummaryInformation
Would return the summary information metadata for the document, if we supported it
- Overrides:
getSummaryInformation in class POIOLE2TextExtractor
- Returns:
- The Summary information for the document or null if it could not be read for this document.
-
setIncludeCellComments
public void setIncludeCellComments(boolean includeComments)
Would control the inclusion of cell comments from the document, if we supported it
- Specified by:
setIncludeCellComments in interface ExcelExtractor
- Parameters:
includeComments - true if cell comments should be included
-
setIncludeSheetNames
public void setIncludeSheetNames(boolean includeSheetNames)
Should sheet names be included? Default is true
- Specified by:
setIncludeSheetNames in interface ExcelExtractor
- Parameters:
includeSheetNames - true if the sheet names should be included
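The effect of this flag can be sketched as follows, assuming Apache POI is on the classpath. The class SheetNameToggleSketch, its methods, and the sample sheet name "Inventory" are hypothetical and serve only to illustrate the setter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical class name; not part of Apache POI.
public class SheetNameToggleSketch {

    // Extracts text from .xls bytes, with sheet names switched on or off.
    static String extract(byte[] xls, boolean includeSheetNames) {
        try {
            POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
            try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
                // Configure the extractor before asking for the text.
                extractor.setIncludeSheetNames(includeSheetNames);
                return extractor.getText();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-sheet workbook named "Inventory" with a single string cell.
    static byte[] sampleWorkbook() {
        try (HSSFWorkbook wb = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            wb.createSheet("Inventory").createRow(0).createCell(0).setCellValue("widget");
            wb.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xls = sampleWorkbook();
        System.out.println("With sheet names:\n" + extract(xls, true));
        System.out.println("Without sheet names:\n" + extract(xls, false));
    }
}
```

Turning sheet names off may be preferable when the extracted text feeds a search index and the sheet titles would only add noise.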
-
setFormulasNotResults
public void setFormulasNotResults(boolean formulasNotResults)
Should we return the formula itself, and not the result it produces? Default is false
- Specified by:
setFormulasNotResults in interface ExcelExtractor
- Parameters:
formulasNotResults - true if the formula itself is returned
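A short sketch of the two modes, assuming Apache POI is on the classpath. The class FormulaToggleSketch and its helpers are hypothetical. The sample workbook is evaluated before saving, on the assumption that the extractor reports the result stored in the file and does not recalculate formulas itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical class name; not part of Apache POI.
public class FormulaToggleSketch {

    // Extracts text from .xls bytes, returning formulas or their stored results.
    static String extract(byte[] xls, boolean formulasNotResults) {
        try {
            POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
            try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
                extractor.setFormulasNotResults(formulasNotResults);
                return extractor.getText();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a workbook with A1=40, B1=2 and C1 holding the formula A1+B1.
    static byte[] sampleWorkbook() {
        try (HSSFWorkbook wb = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            HSSFRow row = wb.createSheet("Sums").createRow(0);
            row.createCell(0).setCellValue(40);
            row.createCell(1).setCellValue(2);
            row.createCell(2).setCellFormula("A1+B1");
            // Evaluate so that the saved file carries a cached result for C1.
            wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
            wb.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xls = sampleWorkbook();
        System.out.println("Results:\n" + extract(xls, false));
        System.out.println("Formulas:\n" + extract(xls, true));
    }
}
```

For indexing, the default (results) is usually what is wanted; returning formulas is more useful when auditing how a sheet computes its values.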
-
getText
Retrieves the text contents of the file
- Specified by:
getText in interface ExcelExtractor
- Specified by:
getText in class POITextExtractor
- Returns:
- All the text from the document
-