

Store in a file a web page written in chinese

Antonio - 25 Oct 2004 09:10 GMT
Hi,
I want to read an HTML page written in Chinese and store it in a file
with the .aspx extension. I'm not sure where the problem is. I use the
following lines of code:

String sAddress = "http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http://www.etantonio.it/EN/index.aspx";

// Download the translated page, decoding the response bytes as UTF-8.
WebRequest req = WebRequest.Create(sAddress);
WebResponse result = req.GetResponse();
Stream ReceiveStream = result.GetResponseStream();
StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8);
String sHtmlTradotto = reader.ReadToEnd();
reader.Close();

// Overwrite prova.aspx with the downloaded HTML, encoded as UTF-8.
StreamWriter writer = new StreamWriter("prova.aspx", false,
    System.Text.Encoding.UTF8);
writer.Write(sHtmlTradotto);
writer.Flush();
writer.Close();

But the resulting file doesn't contain the Chinese characters. How can
I solve the problem?

Many thanks in advance,

                Ing. Antonio D'Ottavio
Nitin - 29 Oct 2004 09:26 GMT
The file you have may well contain the Chinese characters, but you may not
be seeing them because of an improper font setting. Select a font that
supports Chinese and then check again.
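
One way to test this without relying on fonts is to look at the character codes in the saved file. The following is a minimal sketch, not part of the original thread: the class and method names are hypothetical, it checks only the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), and it assumes .NET 2.0 or later for File.ReadAllText.

```csharp
using System.IO;
using System.Text;

public static class CjkCheck
{
    // Returns true if the text holds at least one character in the
    // basic CJK Unified Ideographs block (U+4E00..U+9FFF).
    public static bool ContainsCjk(string text)
    {
        foreach (char c in text)
        {
            if (c >= '\u4E00' && c <= '\u9FFF')
                return true;
        }
        return false;
    }

    // Reads the file as UTF-8 and applies the same check.
    public static bool FileContainsCjk(string path)
    {
        return ContainsCjk(File.ReadAllText(path, Encoding.UTF8));
    }
}
```

If FileContainsCjk returns false for the saved page, the characters are missing from the data itself and the font is not the cause.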

Antonio - 02 Nov 2004 15:57 GMT
With the following .aspx page I try to translate one of my pages from
English to Chinese, using UTF-8. The result is that the Chinese
characters are not read correctly. If instead I enter the address
http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh

directly into the browser, the page is shown correctly in Chinese. If I
save that page, put it on my site, and use the same script below to
read it and save it, again with UTF-8, the Chinese characters are saved
normally. What do you think the problem is? My goal is to automatically
save the content of the page
http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh
to a file with the .aspx extension.


Thanks and regards,
   Antonio D'Ottavio
   www.etantonio.it

<%@ Page Language="c#" debug="true" trace="true"%>
<%@ import Namespace="System" %>
<%@ import Namespace="System.IO" %>
<%@ import Namespace="System.Net" %>

<script runat="server">
   static string sLanguageSrc = "EN";
   static string sLanguageDest = "ZH";
   string PathDirectory ;           
    static FileInfo[] fi ;   

   void Page_Load(Object Src, EventArgs E )
   {
     // Build the Babelfish URL for the page to translate.
     String sAddressEncoded =
       HttpUtility.UrlEncode("http://www.etantonio.it/en/index.aspx");
     String sAddress =
       "http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=" +
       sAddressEncoded + "&lp=" + sLanguageSrc + "_" + sLanguageDest;

     // Download the translated page, decoding the response bytes as UTF-8.
     WebRequest req = WebRequest.Create(sAddress);
     WebResponse result = req.GetResponse();
     Stream ReceiveStream = result.GetResponseStream();
     StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8);
     String sHtmlTradotto = reader.ReadToEnd();
     reader.Close();

     // Strip the SymError script block from the downloaded page.
     String RegStringSymError =
"(?i)\\<script\\slanguage=\"JavaScript\"\\>(\\s\\n)*\\<!--(\\s\\n)*function\\sSymError\\(\\)(\\s|\\n)*{(\\s|\\n)*return\\strue;(\\s|\\n)*}(\\s|\\n)*window.onerror\\s=\\sSymError;(\\s\\n)*//--\\>(\\s\\n)*\\</script\\>";
     sHtmlTradotto = Regex.Replace(sHtmlTradotto, RegStringSymError, "");
     Trace.Write("sHtmlTradotto", sHtmlTradotto);

     // Save the result as UTF-8, overwriting any existing file.
     StreamWriter writer = new StreamWriter(
       Server.MapPath("/Etantonio/EN/ZH_Tradotta.aspx"), false,
       System.Text.Encoding.UTF8);
     writer.Write(sHtmlTradotto);
     writer.Flush();
     writer.Close();
   }

</script>

<html>
<head>
<title>Traduttore Cinese</title>
<meta http-equiv="Content-Type" content="text/html;
charset=iso-8859-1">
<META name="author" content="Antonio DOttavio">
<META name="keywords" content="Motore Ricerca Gif Animate, Animated
Gif, Gif Animate, Gif, Animated, WebMaster, Web, Azioni, Borsa,
Grafici, Criteri, Elettronica, Telecomunicazioni, Informatica,
Università, Economia, Finanza">
<meta name="description" content="Motore Ricerca Gif Animate, Animated
Gif">
<link href="../../Stili.css" rel="stylesheet" type="text/css">
</head>

<body>
</body>
</html>
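
The script above always decodes the response as UTF-8. Whether that is the cause of the missing characters is not established in this thread, but a mismatch between the hard-coded encoding and the charset the server declares is one thing worth ruling out. The following is a minimal sketch under that assumption: the class and method names are hypothetical, and it only parses a Content-Type header value and falls back to a default when no usable charset is named.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ResponseCharset
{
    // Picks the Encoding named by the charset parameter of a Content-Type
    // value such as "text/html; charset=utf-8". Returns the fallback when
    // the header is missing, has no charset, or names an unknown charset.
    public static Encoding FromContentType(string contentType, Encoding fallback)
    {
        if (contentType == null)
            return fallback;

        Match m = Regex.Match(contentType,
            @"charset\s*=\s*[""']?([^\s;""']+)", RegexOptions.IgnoreCase);
        if (!m.Success)
            return fallback;

        try
        {
            return Encoding.GetEncoding(m.Groups[1].Value);
        }
        catch (ArgumentException)
        {
            return fallback;   // charset name not recognised
        }
        catch (NotSupportedException)
        {
            return fallback;   // charset not available on this platform
        }
    }
}
```

In the page above this could be used as new StreamReader(ReceiveStream, ResponseCharset.FromContentType(result.ContentType, Encoding.UTF8)), so that the decoder follows whatever the server reports.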
Sylvain Lafontaine - 02 Nov 2004 18:44 GMT
Trying to display Chinese with the charset iso-8859-1?  If you want to
display Chinese, the whole page must be in Unicode, not just one part of
it with the other part in Italian.

Replace iso-8859-1 with utf-8 and take a look at the following two articles
(especially the end of the first one).  The second one is there in case you
need to know the code page for UTF-8 (65001: Response.CodePage = 65001 or
Session.CodePage = 65001, but Response.CharSet = "UTF-8").

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_dataencoding.asp


http://support.microsoft.com/?kbid=232580
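
The two points above (iso-8859-1 cannot hold Chinese text, and 65001 is the code page for UTF-8) can be checked with a small sketch. It is an illustration only, with hypothetical class and method names: it encodes a string with a named charset, decodes it again, and reports what survives.

```csharp
using System.Text;

public static class CharsetDemo
{
    // Encodes the text with the named charset and decodes it again,
    // returning whatever survives the round trip.
    public static string RoundTrip(string text, string charsetName)
    {
        Encoding encoding = Encoding.GetEncoding(charsetName);
        return encoding.GetString(encoding.GetBytes(text));
    }

    // Returns the Windows code page number behind a charset name.
    public static int CodePageOf(string charsetName)
    {
        return Encoding.GetEncoding(charsetName).CodePage;
    }
}
```

Chinese text comes back unchanged from RoundTrip with "utf-8", while with "iso-8859-1" the characters are lost (typically replaced by question marks). CodePageOf("utf-8") gives the 65001 mentioned above.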

S. L.

Antonio - 04 Nov 2004 08:58 GMT
Hi Sylvain,
I did what you suggested: in my page named TraduttoreCinese I changed
the charset to utf-8, so now I have "charset=utf-8" and
System.Text.Encoding.UTF8 both for reading the page from the web and
for writing it to a file. This is the code:

////////////////////////////////////////////////////////////////////////
<%@ Page Language="c#" debug="true" trace="true"%>
<%@ import Namespace="System" %>
<%@ import Namespace="System.IO" %>
<%@ import Namespace="System.Net" %>

<script runat="server">
   static string sLanguageSrc = "EN";
   static string sLanguageDest = "ZH";
   string PathDirectory ;           
    static FileInfo[] fi ;   

   void Page_Load(Object Src, EventArgs E )
   {
     // Build the Babelfish URL for the page to translate.
     String sAddressEncoded =
       HttpUtility.UrlEncode("http://www.etantonio.it/en/index.aspx");
     String sAddress =
       "http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=" +
       sAddressEncoded + "&lp=" + sLanguageSrc + "_" + sLanguageDest;

     // Download the translated page, decoding the response bytes as UTF-8.
     WebRequest req = WebRequest.Create(sAddress);
     WebResponse result = req.GetResponse();
     Stream ReceiveStream = result.GetResponseStream();
     StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8);
     String sHtmlTradotto = reader.ReadToEnd();
     reader.Close();
     Trace.Write("sHtmlTradotto", sHtmlTradotto);

     // Save the result as UTF-8, overwriting any existing file.
     StreamWriter writer = new StreamWriter(
       Server.MapPath("/Etantonio/EN/ZH_Tradotta.aspx"), false,
       System.Text.Encoding.UTF8);
     writer.Write(sHtmlTradotto);
     writer.Flush();
     writer.Close();
   }

</script>

<html>
<head>
<title>Traduttore Cinese</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
</body>
</html>
///////////////////////////////////////////////////////////////////////////

The result is still not good: the output below shows no Chinese
characters, unlike what I see when I open this URL directly in the
browser:

http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh


This, instead, is my ugly result:
///////////////////////////////////////////////////////////////////////////
sHtmlTradotto <html><meta http-equiv="content-type"
content="text/html; charset=UTF-8"><base
href="http://www.etantonio.it/en/index.aspx">
<!-- removed --><meta http-equiv="Content-Type" content="text/html ;
CHARSET=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<head>
<title>Etantonio</title>
<meta name="author" content="Antonio DOttavio">
<meta name="description" content="Etantonio Index">
<link href="Stili.css" rel="stylesheet" type="text/css">
</head>
<body>

<script language=JavaScript src="menu_array.js"
type=text/javascript></script>
<script language=JavaScript src="mmenu.js"
type=text/javascript></script>

<table width="750" height="430"  border="0" cellpadding="0"
cellspacing="0" background="/images/EsserSpettatoriNonEstSerioElefante.jpg">
<tr>
  <td valign="top">

<table width="90%" border="0" align="center" cellspacing="12">
<tr height="70" valign="top">
  <td>&nbsp;</td>
  <td width="25%" rowspan="2">
  <p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fUniversita%2findex.aspx
"
class="testoMedioVerde"></a></p>
  <p align="center" class="testoPiccolissimoVerde">, </p>
  </td>
  <td width="25%" rowspan="2">
  <p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx
"
class="testoMedioVerde"></a> </p>
  <p align="center" class="testoPiccolissimoVerde">, , , 1994
</p></td>
  <td width="25%">&nbsp;</td>
</tr>
<tr height="140" valign="top">
  <td width="25%">
  <p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx
"
class="testoMedioVerde"></a> </p>
  <p align="center" class="testoPiccolissimoVerde">, , </p>
</td>
  <td width="25%">
  <p align="center" ><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx
"
class="testoMedioVerde"></a> </p>
  <p align="center" class="testoPiccolissimoVerde">GIF , </p>
  </td>
</tr>
<tr  valign="top">
  <td width="25%">
  <p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx
"
class="testoMedioVerde"></a> </p>
  <p align="center" class="testoPiccolissimoVerde">, , , </p>
  </td>
  <td width="25%"> <div align="center"></div></td>
  <td width="25%"> <div align="center"></div></td>
  <td width="25%">
  <p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx
"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3
a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx
"
class="testoMedioVerde"></a></p>
  <p align="center" class="testoPiccolissimoVerde">nel delle </p>
</td>
</tr>
</table>

  </td>
</tr>
</table>
      <script>InserisciFooter();</script>
<br>

</body>
</html>
///////////////////////////////////////////////////////////////////////////
Sylvain Lafontaine - 04 Nov 2004 19:03 GMT
Hi,

   I don't have the time to set up a full test on my system right now.
However, I can see this duplicate header:

sHtmlTradotto <html><meta http-equiv="content-type"
content="text/html; charset=UTF-8"><base
href="http://www.etantonio.it/en/index.aspx">
<!-- removed --><meta http-equiv="Content-Type" content="text/html ;
CHARSET=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Maybe IE is unable to see that the charset is indeed UTF-8.  Have you tried
setting the encoding directly to Unicode (UTF-8) in the options of IE?

You should also test your code by first writing out only the Chinese page,
without any content of your own, and also try using an IFrame.
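
To see how many charset declarations a downloaded page carries, such as the duplicate header quoted above, the HTML can be scanned before it is written to disk. This is a rough sketch with hypothetical names: it uses a simple regular expression rather than a real HTML parser, so it also counts a charset= that appears outside a meta tag.

```csharp
using System.Text.RegularExpressions;

public static class MetaCharsetCheck
{
    // Returns every charset value found in the HTML text, in document
    // order. More than one entry means the page declares its charset
    // more than once.
    public static string[] DeclaredCharsets(string html)
    {
        MatchCollection matches = Regex.Matches(html,
            @"charset\s*=\s*[""']?([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase);

        string[] found = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            found[i] = matches[i].Groups[1].Value;
        }
        return found;
    }
}
```

Applied to the header shown above, this returns two entries, both "UTF-8", so the declarations are duplicated but at least agree with each other.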

S. L.

