C# .NET - Reading HTML data and writing to a text file?
Asked By svt gdwl on 03-Apr-09 03:51 AM
I just want to read the table data in an HTML file and write it to a
text file as it appears in the HTML view. I mean, I don't want to
retrieve the source code; I want the text, as well as the empty spaces
between the table cells, in the text file too.
How can I do this in C#?
Re - Kalit Sikka replied to svt gdwl on 03-Apr-09 04:05 AM
Vasanthakumar D replied to svt gdwl on 03-Apr-09 04:12 AM
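Here is the code for this:
// Requires: using System.IO;
string strLings = File.ReadAllText("E:\\test.html");
// Assumes a bare <table> tag with no attributes and a single table in the file.
int startIndex = strLings.IndexOf("<table>");
int length = strLings.IndexOf("</table>") + "</table>".Length - startIndex;
string strTab = strLings.Substring(startIndex, length);
// Append the extracted table markup to the output file; the using block closes the writer.
using (StreamWriter strWr = new StreamWriter("E:\\test2.txt", true))
{
    strWr.Write(strTab);
}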
extract text contents of a HTML table - mv ark replied to svt gdwl on 03-Apr-09 04:16 AM
To rephrase your question as I understand it, you need to extract the text contents of an HTML table. One way is to use regular expressions to match and extract just the text.
You can adapt the code sample from this link: http://social.msdn.microsoft.com/Forums/en-US/regexp/thread/389b5bb0-b68f-4e4e-ba9f-cbecf7a86b67
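The regular-expression approach described above could look roughly like the sketch below. It is not the code from the linked thread; the class name TableTextExtractor and the method ExtractCells are hypothetical names chosen for illustration. It assumes well-formed, non-nested tables and does not handle rowspan or colspan.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not taken from the thread or the linked sample.
public static class TableTextExtractor
{
    // One <tr>...</tr> row, with or without attributes on the tag.
    private static readonly Regex RowPattern = new Regex(
        @"<tr\b[^>]*>(.*?)</tr>", RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // One <td> or <th> cell, with or without attributes on the tag.
    private static readonly Regex CellPattern = new Regex(
        @"<t[dh]\b[^>]*>(.*?)</t[dh]>", RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Any remaining tag inside a cell, such as <b> or <a href="...">.
    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");

    // Returns the visible text of every cell, grouped by row.
    public static List<List<string>> ExtractCells(string html)
    {
        var rows = new List<List<string>>();
        foreach (Match row in RowPattern.Matches(html))
        {
            var cells = new List<string>();
            foreach (Match cell in CellPattern.Matches(row.Groups[1].Value))
            {
                // Drop nested tags, then decode entities such as &amp;.
                string text = TagPattern.Replace(cell.Groups[1].Value, "");
                cells.Add(WebUtility.HtmlDecode(text).Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

Calling TableTextExtractor.ExtractCells(File.ReadAllText("E:\\test.html")) would give one list of cell strings per table row, which can then be written out in whatever layout is needed.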
Little brother.. are you from Salem TPT college? - Stella Pandian replied to Vasanthakumar D on 03-Apr-09 04:24 AM
sample code - Sathish S replied to svt gdwl on 03-Apr-09 04:27 AM
re - Web Star replied to svt gdwl on 03-Apr-09 04:52 AM
First, read the HTML page that you want to convert into text:
string fileName = Server.MapPath("") + "/temp.html";
FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read);
StreamReader reader = new StreamReader(stream);
string html = reader.ReadToEnd(); // read the whole page into a string
reader.Close(); // also closes the underlying stream
After that, you can use a regular expression to remove all the table-related HTML tags, such as <table>, <tr>, <td>, and so on.
Regex regex = new Regex(@"<tr>([^<]|(<[^t])|(<t[^r])|(<tr[^>]))*" + @"title=.*?</tr>", RegexOptions.Singleline | RegexOptions.Multiline);
Hope this helps.
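As a rough sketch of the tag-removal idea in this reply: the patterns below are simpler ones swapped in for illustration, not the regex posted above, and TableTagStripper and ToPlainText are hypothetical names. The sketch assumes a single, non-nested table and separates cells with tabs and rows with line breaks.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating tag stripping with Regex.Replace.
public static class TableTagStripper
{
    public static string ToPlainText(string tableHtml)
    {
        // Collapse source line breaks and indentation so only the markup decides the layout.
        string text = Regex.Replace(tableHtml, @"\s+", " ");
        // The boundary between two adjacent cells becomes a tab.
        text = Regex.Replace(text, @"</t[dh]>\s*<t[dh]\b[^>]*>", "\t", RegexOptions.IgnoreCase);
        // The end of each row becomes a line break.
        text = Regex.Replace(text, @"</tr>", "\n", RegexOptions.IgnoreCase);
        // Remove every remaining tag (<table>, <tr>, <td>, <b>, ...).
        text = Regex.Replace(text, @"<[^>]+>", "");

        // Trim stray spaces (but not tabs) around each line and drop blank lines.
        string[] lines = text.Split('\n')
            .Select(line => line.Trim(' '))
            .Where(line => line.Length > 0)
            .ToArray();

        // Decode entities such as &amp; last, so decoded '<' characters are not treated as tags.
        return WebUtility.HtmlDecode(string.Join(Environment.NewLine, lines));
    }
}
```

Tabs keep the cells apart but do not guarantee that columns line up when cell lengths differ widely; padding each cell to a fixed width is needed for that.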
no, why? - Vasanthakumar D replied to Stella Pandian on 03-Apr-09 05:11 AM
This is not a private message board. Don't ask anything like this... :)
Vasanthakumar D replied to Stella Pandian on 03-Apr-09 05:12 AM
No, a junior of mine studied at that college... I wondered whether you were him... Where is the private message board?
Stella Pandian replied to Vasanthakumar D on 03-Apr-09 05:17 AM
there is no PM here.... - Vasanthakumar D replied to Stella Pandian on 03-Apr-09 05:24 AM
in the text file also, table data should be rendered in same manner - svt gdwl replied to Web Star on 03-Apr-09 08:20 AM
Suppose I read the table data that is present in the HTML page in the following way:
The following figure shows how the table defined in this example renders.
After reading this table from the HTML page, how do I write it to a text file
so that the table's content is rendered in the same manner in the
text file as well?
re - Web Star replied to svt gdwl on 03-Apr-09 08:43 AM
For what you need, read the HTML and then remove everything from that string except the actual data values, and place blank spaces between them.
yes, how can I do that? - svt gdwl replied to Web Star on 03-Apr-09 10:57 AM
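One possible way to place the blank spaces, sketched under the assumption that the cell text has already been extracted into rows of strings: pad every cell to the width of the widest entry in its column, so the columns line up when the file is viewed in a monospaced font. TextTableFormatter, Format, and the gap parameter are hypothetical names, not something given in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: lays out already-extracted cell text in aligned columns.
public static class TextTableFormatter
{
    // gap is the number of blank spaces left between neighbouring columns.
    public static string Format(IEnumerable<IEnumerable<string>> rows, int gap = 2)
    {
        List<List<string>> table = rows.Select(r => r.ToList()).ToList();

        // Find the widest cell in each column.
        int columnCount = table.Count == 0 ? 0 : table.Max(r => r.Count);
        var widths = new int[columnCount];
        foreach (var row in table)
            for (int i = 0; i < row.Count; i++)
                widths[i] = Math.Max(widths[i], row[i].Length);

        var output = new StringBuilder();
        foreach (var row in table)
        {
            var line = new StringBuilder();
            for (int i = 0; i < row.Count; i++)
                line.Append(row[i].PadRight(widths[i] + gap));
            // Padding after the last cell of a row is not needed.
            output.AppendLine(line.ToString().TrimEnd());
        }
        return output.ToString();
    }
}
```

The result can be written with File.WriteAllText("E:\\test2.txt", TextTableFormatter.Format(rows)). Cells that span several rows or columns are not handled by this sketch.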