Earlier this month , I was working on a small project for a client ,everyday I receive a couple of excel files that contain some data ,I have to .
- 1-Insert this data to a table.
2-Clean the data(empty rows removed ,rows missing data moved to another table).
3-check for duplicated data in this table and delete them.
4-check for duplicated data between this table and the original table and if there is duplication and move them to another ‘duplication table’.
Now , those steps are only for one received file ,if I have more I loose a lot of time .
So ,I said to my self ,why not build a small application or website that do all this work for me .
The requirements for my application is to browse for an excel file, Upload the file , read each row from the excel file and insert into a table , when it finish inserting a stored procedure will run a sql query that I have created ,
and of course I will host the web application on azure app service .
So , I developed the web application and tested on my machine and then publish it to azure and at this time ,
I run into the “It works on my machine”
I mean I tested the app , the connection string everything was fine , I even hosted on my local IIS and it worked smoothly .
So I said it’s time to investigate so I went to my app service , looked for “Diagnose and solve problems” and than found out this error : ‘The Microsoft.ACE.OLEDB.12.0 provider is not registered on the local machine‘.
Well , I’m using the OleDbDataAdapter
to read the data from the excel file into the dataset.
And it seems that the Microsoft.ACE.OLEDB.12.0 Provider is only installed on my machine and can not be available on the azure app service .
So , I had to quickly find a new way to read my excel file without using the OleDbDataAdapter
and that when I discovered a plugin : NPOI .
What is NPOI ?
NPOI is the .NET version of POI Java project at http://poi.apache.org/. POI is an open source project which can help you read/write xls, doc, ppt files. It has a wide application.
For example, you can use it to
- generate a Excel report without Microsoft Office suite installed on your server and more efficient than call Microsoft Excel ActiveX at background;
- extract text from Office documents to help you implement full-text indexing feature (most of time this feature is used to create search engines).
- extract images from Office documents
- generate Excel sheets that contains formulas .
You can install the package by running this command :
Install-Package DotNetCore.NPOI
Now , this is the way I was reading my excel file using the OleDbDataAdapter :
//get extension string extension = Path.GetExtension(filename); string conString = string.Empty; switch (extension) { case ".xls": //Excel 97-03. conString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath + ";Extended Properties='Excel 8.0;HDR=YES'"; break; case ".xlsx": //Excel 07 and above. conString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties='Excel 8.0;HDR=YES'"; break; } DataTable dt = new DataTable(); conString = string.Format(conString, filePath); using (OleDbConnection connExcel = new OleDbConnection(conString)) { using (OleDbCommand cmdExcel = new OleDbCommand()) { using (OleDbDataAdapter odaExcel = new OleDbDataAdapter()) { cmdExcel.Connection = connExcel; connExcel.Open(); DataTable dtExcelSchema; dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); string sheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString(); connExcel.Close(); connExcel.Open(); cmdExcel.CommandText = "SELECT * From [" + sheetName + "]"; odaExcel.SelectCommand = cmdExcel; odaExcel.Fill(dt); var x = dt.Columns.AsParallel(); connExcel.Close(); } } }
Now using the NPOI package we can do the same reading and clear out our problem and we don’t have to worry about it again .
private static DataTable GetRequestsDataFromExcel(string fullFilePath) { try { var sh = GetFileStream(fullFilePath); var dtExcelTable = new DataTable(); dtExcelTable.Rows.Clear(); dtExcelTable.Columns.Clear(); var headerRow = sh.GetRow(0); int colCount = headerRow.LastCellNum; for (var c = 0; c < colCount; c++) dtExcelTable.Columns.Add(headerRow.GetCell(c).ToString()); var i = 1; var currentRow = sh.GetRow(i); while (currentRow != null) { var dr = dtExcelTable.NewRow(); for (var j = 0; j < currentRow.Cells.Count; j++) { var cell = currentRow.GetCell(j); if (cell != null) switch (cell.CellType) { case CellType.Numeric: dr[j] = DateUtil.IsCellDateFormatted(cell) ? cell.DateCellValue.ToString(CultureInfo.InvariantCulture) : cell.NumericCellValue.ToString(CultureInfo.InvariantCulture); break; case CellType.String: dr[j] = cell.StringCellValue; break; case CellType.Blank: dr[j] = string.Empty; break; } } dtExcelTable.Rows.Add(dr); i++; currentRow = sh.GetRow(i); } return dtExcelTable; } catch (Exception e) { throw; } } private static ISheet GetFileStream(string fullFilePath) { var fileExtension = Path.GetExtension(fullFilePath); string sheetName; ISheet sheet = null; switch (fileExtension) { case ".xlsx": using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read)) { var wb = new XSSFWorkbook(fs); sheetName = wb.GetSheetAt(0).SheetName; sheet = (XSSFSheet)wb.GetSheet(sheetName); } break; case ".xls": using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read)) { var wb = new HSSFWorkbook(fs); sheetName = wb.GetSheetAt(0).SheetName; sheet = (HSSFSheet)wb.GetSheet(sheetName); } break; } return sheet; }
After this , I republished the project and everything worked fine .
well hope this was helpful ,time to get some food .
have a good day 🙂