Introduction To Apache POI – The Java API To Read And Write Microsoft Documents – Excel

Apache POI provides pure Java libraries for reading and writing files in Microsoft Office formats, such as Word, PowerPoint and Excel. It has different components for reading and writing different Microsoft Office files:

  1. POIFS for OLE 2 Documents
  2. HSSF and XSSF for Excel Documents
  3. HWPF and XWPF for Word Documents
  4. HSLF and XSLF for PowerPoint Documents
  5. HPSF for OLE 2 Document Properties
  6. HDGF and XDGF for Visio Documents
  7. HPBF for Publisher Documents
  8. HMEF for TNEF (winmail.dat) Outlook Attachments
  9. HSMF for Outlook Messages

As the list shows, Apache POI is not only for reading and writing Excel sheets; it provides separate components for the different Microsoft Office file types. So, if an interviewer asks how you read an Excel file in Java, your answer should be: “We use the HSSF and XSSF components provided by Apache POI to read and write Excel files in Java”.

Excel files come with two extensions:

  1. .xls (Microsoft Excel 97-2003 file)
  2. .xlsx (Microsoft Excel 2007 or later file)

Apache POI provides a different component for each Excel extension, as given below:

  1. .xls – HSSF (Horrible SpreadSheet Format). You need to download the “poi” jar files.
  2. .xlsx – XSSF (XML SpreadSheet Format). You need to download the “poi-ooxml” jar files.
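
The mapping above can be sketched in plain Java. This is only an illustration: the class `PoiComponentChooser`, the enum `ExcelComponent` and the method `componentFor` are hypothetical names made up for this post, not part of Apache POI, and the sketch looks only at the file name.

```java
import java.util.Locale;

public class PoiComponentChooser {

    // Hypothetical enum: the two Excel components from the list above,
    // each paired with the jar the list says you need.
    public enum ExcelComponent {
        HSSF("poi"),
        XSSF("poi-ooxml");

        private final String jar;

        ExcelComponent(String jar) {
            this.jar = jar;
        }

        public String jar() {
            return jar;
        }
    }

    // Picks the component from the file extension only (case-insensitive).
    public static ExcelComponent componentFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        // Check .xlsx first so it is not mistaken for anything else.
        if (lower.endsWith(".xlsx")) {
            return ExcelComponent.XSSF;
        }
        if (lower.endsWith(".xls")) {
            return ExcelComponent.HSSF;
        }
        throw new IllegalArgumentException("Not an .xls or .xlsx file: " + fileName);
    }

    public static void main(String[] args) {
        // Placeholder file names, used only for the demonstration.
        for (String name : new String[] {"report.xls", "report.xlsx"}) {
            ExcelComponent component = componentFor(name);
            System.out.println(name + " -> " + component + " (" + component.jar() + " jar)");
        }
    }
}
```

A helper like this only tells you which component to reach for; the actual reading is still done with the HSSF or XSSF classes from the corresponding jar.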

If you do not want to write separate code for each extension, you can use the common spreadsheet (SS) usermodel, which requires both the poi-ooxml and core poi libraries. The common usermodel is widely used now because it can read an Excel file with either extension using the same lines of code, so there is no need to choose between HSSF and XSSF for the workbook and sheet.
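
As a minimal sketch of the common usermodel, the class below reads the first sheet of a file into rows of text without mentioning HSSF or XSSF anywhere. It assumes Apache POI 4.x or later with both the poi and poi-ooxml jars on the classpath; the class name `SsUsermodelReader` and the method `readFirstSheet` are hypothetical names chosen for this example.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SsUsermodelReader {

    // Reads the first sheet of an .xls or .xlsx file into rows of cell text.
    // Only the common SS interfaces (Workbook, Sheet, Row, Cell) are used.
    public static List<List<String>> readFirstSheet(File file) throws IOException {
        DataFormatter formatter = new DataFormatter();
        List<List<String>> rows = new ArrayList<>();
        // WorkbookFactory detects the format itself; password is null and
        // readOnly is true so that closing the workbook does not touch the file.
        try (Workbook workbook = WorkbookFactory.create(file, null, true)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {            // iterates only rows that exist
                List<String> cells = new ArrayList<>();
                for (Cell cell : row) {        // iterates only cells that exist
                    cells.add(formatter.formatCellValue(cell));
                }
                rows.add(cells);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of any .xls or .xlsx file as the first argument.
        for (List<String> row : readFirstSheet(new File(args[0]))) {
            System.out.println(row);
        }
    }
}
```

The same method works for both extensions because `WorkbookFactory` returns whichever workbook implementation matches the file, and the rest of the code only talks to the shared interfaces.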

Let’s update the answer to the question discussed above:

So, if an interviewer asks how you read an Excel file in Java, your answer should be: “We use the HSSF, XSSF or common spreadsheet (SS) usermodel provided by Apache POI to read and write Excel files in Java”. The common SS usermodel can read and write Excel files with both extensions, .xls and .xlsx.

In the next post, we will explore it further. Stay tuned.

In case of any doubt or suggestion, or if you find a mistake, feel free to let me know in the comments.

#ThanksForReading


Author: Amod Mahajan

My name is Amod Mahajan and I am an IT employee with 6+ years of experience in software testing, living in Bengaluru. My area of interest is automation testing. I started from the basics and went through many Selenium tutorials. Thanks to Mukesh Otwani, whose tutorials are easy to follow and cover basic to advanced topics. I have a habit of exploring concepts by diving deep, and I used to make notes. I thought of sharing my knowledge through posts, and now I am here. #KeepLearning #ShareLearning