September 13, 2012

Solution for Parsing an Excel file in C# and the cells seem to get cut off at 255 characters


Problem: An excel sheet containing say 10 rows and in couple of columns in couple of rows having data with more than 255 characters. When I try to upload this file in ASP.Net application using traditional OLEDB provider I cannot saw the whole data in particular cell means which is chopping out or we can say trimming to 255 characters when loaded in to my dataset object.

Solution: Having done sleepless nights I found some solutions to fix this issue… those are presented below

Solution-1:

If your application is used only on one computer then you can directly go to the following registry settings and change the TypeGuessRows  value.

HKLM\Software\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows

64 bit systems:

HKLM\SOFTWARE\wow6432node\microsoft\jet\4.0\engines\excel\TypeGuessRows

By setting this value to zero, the all lines of your spreadsheet are scanned for type guessing, rather than the default of 8. If any text fields longer than 255 chars are encountered, then those columns are deemed to be memo fields.

Note that you are still not 100% guaranteed to get the right data types, depending on your data.

Note also the HKLM scope of this key though - it will affect every OleDB Excel import by any process on that machine and this lead to degrade performance depending on the size of the data.

Solution-2:

A second way to work around this problem (without modifying the registry) is to make sure that rows with fields, which have data 255 characters or greater, are present in the first 8 rows (default value of TypeGuessRows is 8) of the source data file.

 

Solution-3:

This is the recommended solution by me as there is no need to change any registry or take care to have those lengthy data to be in first 8 rows. Instead we have a tool called NPOI which can be download from npoi.codeplex.com.

Using this dll we can upload the spreadsheet without worrying of chopping your data and also it has many features like creating the spreadsheet on fly including charts, reports etc.., for more information you can find on this site npoi.codeplex.com.

Anyway reading data using this NPOI is different from the traditional OLEDB provider. Please find the following method which with return a Data Table object by sending the File Path and the respective SheetName as input.

public static DataTable getExcelData(string FileName, string strSheetName)

    {

        DataTable dt = new DataTable();

        HSSFWorkbook hssfworkbook;

        using (FileStream file = new FileStream(FileName, FileMode.Open, FileAccess.Read))

        {

            hssfworkbook = new HSSFWorkbook(file);

        }

 

        ISheet sheet = hssfworkbook.GetSheet(strSheetName);

        System.Collections.IEnumerator rows = sheet.GetRowEnumerator();

       

        while (rows.MoveNext())

        {

            IRow row = (HSSFRow)rows.Current;

 

            if (dt.Columns.Count == 0)

            {

                for (int j = 0; j < row.LastCellNum; j++)

                {

                    dt.Columns.Add(row.GetCell(j).ToString());

                }

 

                continue;

            }

 

            DataRow dr = dt.NewRow();

            for (int i = 0; i < row.LastCellNum; i++)

            {

                ICell cell = row.GetCell(i);

 

                if (cell == null)

                {

                    dr[i] = null;

                }

                else

                {

                    dr[i] = cell.ToString();

                }

            }

            dt.Rows.Add(dr);

        }

 

        return dt;

    }

 

May be I presented the solutions straight forward without more explanation or discussion but presently my motto is to provide you the reasonable and permanent solution for those who suffering with similar problem.

Hope this research and the code helps you a lot.. if so please drop a comment below which may be more encroached for me..

 

Happy Coding J   

 

No comments:

Post a Comment