
How can I clean html tags?


motovlc


I am creating an XML file from an Access database and saving it on the server, using this piece of code:

-------------
<%
Dim objConn, strConnect, strSQL, rs, tb, mdbFile, objFSO, xmlFile, objWrite

xmlFile = Server.MapPath("myxmlfile.xml")
mdbFile = Server.MapPath("/database/database.mdb")
tb = chr(9)

set objFSO = Server.CreateObject( "Scripting.FileSystemObject" )
Set objConn = Server.CreateObject( "ADODB.Connection" )
objConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & mdbFile

If Not objFSO.FileExists( xmlFile ) Then objFSO.CreateTextFile( xmlFile )
set objWrite = objFSO.OpenTextFile( xmlFile, 2 )

objWrite.WriteLine("<?xml version=""1.0"" encoding=""ISO-8859-1""?>")
objWrite.WriteLine("<data>")

strSQL = "SELECT * FROM products WHERE 1=1"
Set rs = objConn.Execute(StrSQL)

Do While not rs.EOF
    objWrite.WriteLine(tb & "<item>")
    objWrite.WriteLine(tb & tb & "<catalogid>" & rs("catalogid") & "</catalogid>")
    objWrite.WriteLine(tb & tb & "<name>" & replace(rs("name"),"&","&amp;") & "</name>")
    objWrite.WriteLine(tb & tb & "<price>" & replace(rs("price"),"&","&amp;") & "</price>")
    objWrite.WriteLine(tb & tb & "<description>" & replace(rs("description"),"&","&amp;") & "</description>")
    objWrite.WriteLine(tb & "</item>")
    rs.MoveNext
Loop

objWrite.WriteLine("</data>")
objWrite.Close()
response.write "XML file updated"
%>
---------------

It works great, but I have a problem.
The "description" field in the database contains HTML tags, so the resulting XML file looks something like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<data>
    <item>
        <catalogid>1</catalogid>
        <name>PRODUCT HERE</name>
        <price>89</price>
        <description><P>Description here.<BR><SPAN style="COLOR: #ffa500; FONT-SIZE: 10pt" itxtharvested="0" itxtnodeid="309"><STRONG onmouseover="ddrivetip('*some text')" onmouseout=hideddrivetip() itxtharvested="0" itxtnodeid="310" ;>Description here <SPAN style="COLOR: #000000; FONT-SIZE: 8pt">*</SPAN><BR itxtnodeid="311"></STRONG></SPAN><STRONG itxtharvested="0" itxtnodeid="307">Description here</STRONG><BR itxtnodeid="306">12V <BR>Description here<BR>Description here<BR>Description here<BR><SPAN style="FONT-SIZE: 8pt">Wh: 15,4<BR itxtnodeid="304"></SPAN><SPAN style="COLOR: #ff0000" itxtharvested="0" itxtnodeid="303">Description here</SPAN> </P><P><SPAN style="COLOR: #000000; FONT-SIZE: 8pt" itxtharvested="0" itxtnodeid="303">Description text here<BR itxtnodeid="312"></SPAN></P></description>
    </item>
</data>

What I need to do is strip the HTML tags from the description field before the XML is created. Can anybody help? Thanks


I'm not sure you'll be able to find recent code for ASP/VBScript, but you should find at least some examples of stripping HTML online: http://www.google.com/search?client=opera&rls=en&q=asp+vbscript+clean+html&sourceid=opera&ie=utf-8&oe=utf-8&channel=suggest

Make sure to read a few of those. Cleaning HTML to protect against things like XSS attacks is a broad problem that doesn't have a small, simple solution, and ASP Classic is old enough now that most people are probably focusing their efforts on other languages.
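For the narrower goal in the question (getting plain text into the <description> element, not sanitizing HTML for display), a minimal sketch using the VBScript RegExp object might look like the following. StripHtml and XmlEscape are hypothetical helper names, not part of the original script, and the pattern assumes that no attribute value contains a ">" character. It is not a defense against XSS.

```vbscript
' Hypothetical helper: replaces anything that looks like a tag with a space,
' then collapses runs of whitespace. Assumes no ">" inside attribute values.
Function StripHtml(ByVal value)
    Dim re, result
    If IsNull(value) Then
        StripHtml = ""
        Exit Function
    End If
    Set re = New RegExp
    re.Global = True
    re.Pattern = "<[^>]*>"
    result = re.Replace(CStr(value), " ")
    re.Pattern = "\s+"
    StripHtml = Trim(re.Replace(result, " "))
End Function

' Hypothetical helper: escapes the characters that are unsafe in XML element text.
' The ampersand must be replaced first so the other entities are not double-escaped.
Function XmlEscape(ByVal s)
    s = Replace(s, "&", "&amp;")
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    XmlEscape = s
End Function

' Possible use inside the loop from the question (objWrite, tb and rs come from that script):
' objWrite.WriteLine(tb & tb & "<description>" & XmlEscape(StripHtml(rs("description").Value)) & "</description>")
```

Replacing each tag with a space instead of an empty string keeps text on either side of a <BR> from running together. HTML entities such as &nbsp; are left in the text and end up escaped as literal characters, so they may need separate handling.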
