I have a UTF-8 PHP file that I edit with Notepad (W2K), and it suddenly started
showing a gap at the top of the page in IE6.
View Source shows a square at the beginning of the file, which, I guess, is
the BOM.
How do I remove it?
Jochen Kalmbach - 02 Oct 2004 17:23 GMT
> I have a UTF-8 PHP file that I edit with Notepad (W2K), and it suddenly
> started showing a gap at the top of the page in IE6.
> View Source shows a square at the beginning of the file, which, I
> guess, is the BOM.
> How do I remove it?
Open it in Notepad and save it as "ANSI".

Signature
Greetings
Jochen
My blog about Win32 and .NET
http://blog.kalmbachnet.de/
aa - 02 Oct 2004 17:58 GMT
> Open it in Notepad and save it as "ANSI".
Then I will lose all the non-ANSI data?
Jochen Kalmbach - 02 Oct 2004 18:07 GMT
> > Open it in Notepad and save it as "ANSI".
>
> Then I will lose all the non-ANSI data?
Yes.

Signature
Greetings
Jochen
My blog about Win32 and .NET
http://blog.kalmbachnet.de/
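
A rough illustration of why the answer is "yes": the sketch below assumes "ANSI" means Windows-1252 (on other systems it may be a different code page) and uses a made-up sample string. Characters that have no slot in the code page cannot survive the conversion; here they are replaced with "?", which is only an approximation of what an editor does when saving.

```python
# Hypothetical sample text: Latin with an accent, Cyrillic, and CJK.
text = "café Привет 日本"

# Assumption: "ANSI" is Windows-1252. Unencodable characters become "?".
ansi_bytes = text.encode("cp1252", errors="replace")

# Reading the bytes back shows that the original characters are gone.
round_trip = ansi_bytes.decode("cp1252")
print(round_trip)
```

The accented Latin letter survives because Windows-1252 contains it; the Cyrillic and CJK characters do not.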
Michael (michka) Kaplan [MS] - 02 Oct 2004 19:11 GMT
The BOM is not visible in Internet Explorer any time that either:
a) IE recognizes the file format (which is to say, usually), or
b) the code point is in the font as a ZERO WIDTH NO-BREAK SPACE (which is
again to say, usually)
You can try right-clicking on the page and verifying the encoding in the
[unlikely] event that neither (a) nor (b) is true.

Signature
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies
Windows International Division
This posting is provided "AS IS" with
no warranties, and confers no rights.
aa - 03 Oct 2004 11:27 GMT
Thanks,
right-clicking --> Encoding shows Unicode (UTF-8).
However, the file in question is a PHP file which includes another UTF-8 PHP
file at the very beginning, using the PHP include statement.
That second file, I guess, has its own BOM, which ends up somewhere after
the first BOM, and that might make it visible in the browser.
In a non-Unicode text editor this shows up as ï»¿
Some time ago I ran across a similar problem with ASP, but I cannot remember how
I got around it.
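
One way to deal with an included file that carries its own BOM is to strip the three BOM bytes from the file on disk. The following is a minimal sketch, not a tested tool: the function names `strip_utf8_bom` and `strip_bom_in_file` are invented for illustration, and it assumes the file really is UTF-8.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file without its BOM; return True if the file was changed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        p.write_bytes(cleaned)
        return True
    return False
```

Only a BOM at the very start of the file is removed; the rest of the bytes are written back unchanged, so non-ASCII text in the file is not affected.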
Michael (michka) Kaplan [MS] - 04 Oct 2004 02:37 GMT
Those are the bytes of a BOM -- and what they would look like if the file was not
detected as UTF-8 (which is not an issue in IE, by your own admission).
If you are combining files in something and nothing is removing the
superfluous BOM, then make sure you view it with a font that recognizes it as
a ZERO WIDTH NO-BREAK SPACE. I know that it can read Unicode in UTF-8 (since
you claim there are many international characters in the file?).
In other words, nothing you have discussed so far should cause a problem.
Eventually you may need to ask the question in a more relevant forum for
the responsible technology (PHP?).

Signature
MichKa [MS]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure and Font Technologies
Windows International Division
This posting is provided "AS IS" with
no warranties, and confers no rights.
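
The two readings of the BOM bytes described above can be checked with a short sketch. It assumes the "non-Unicode" view is Windows-1252, and the sample payloads `A` and `B` are placeholders for the contents of two combined files.

```python
import codecs
import unicodedata

bom = codecs.BOM_UTF8  # the three bytes EF BB BF

# Read as Windows-1252 (an assumption about the "non-Unicode" editor),
# each byte becomes its own visible character.
as_cp1252 = bom.decode("cp1252")
print(as_cp1252)

# Read as UTF-8, the same bytes are the single code point U+FEFF.
as_utf8 = bom.decode("utf-8")
print(unicodedata.name(as_utf8))

# Two BOM-prefixed byte streams joined together: a BOM-aware decoder
# strips only the leading BOM, so the second one stays in the text.
combined = bom + b"A" + bom + b"B"
decoded = combined.decode("utf-8-sig")
print(repr(decoded))
```

The last step is the situation of combined files: the second BOM ends up in the middle of the output, where nothing treats it as a signature any more.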
Joerg Jooss - 02 Oct 2004 20:10 GMT
> I have a UTF-8 PHP file that I edit with Notepad (W2K), and it suddenly started
> showing a gap at the top of the page in IE6.
> View Source shows a square at the beginning of the file, which, I guess, is
> the BOM.
> How do I remove it?
Certain editors like SciTE allow you to save UTF-8 files either with or
without a BOM.
Cheers,

Signature
Joerg Jooss
joerg.jooss@gmx.net
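
The same "with or without BOM" choice exists when writing files from a script. As a small sketch using Python's standard codecs (the sample string is arbitrary): `utf-8-sig` writes the BOM, plain `utf-8` does not.

```python
text = "héllo"

# "utf-8-sig" prepends the three BOM bytes; plain "utf-8" writes none.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")
print(with_bom[:3].hex(), without_bom[:3].hex())

# When reading, "utf-8-sig" accepts both forms and drops a leading BOM.
from_bom = with_bom.decode("utf-8-sig")
from_plain = without_bom.decode("utf-8-sig")
```

For files that PHP will include, the variant without a BOM is the one that avoids the stray bytes in the output.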
Jeremy Pullicino - 11 Oct 2004 12:45 GMT
Is the BOM the special hex numbers found at the beginning of UTF-8 files when
saved with Notepad?
Are UTF-8 text files with no BOM valid UTF-8? If so, how can my application
detect whether a file is in UTF-8 or ANSI?
Jeremy.
Joerg Jooss - 11 Oct 2004 21:25 GMT
> Is the BOM the special hex numbers found at the beginning of UTF-8
> files when saved with Notepad?
Yes. Notepad always prepends the BOM when it saves a file as UTF-8.
> Are UTF-8 text files with no BOM valid utf-8?
Yes. A UTF-8 BOM is optional.
> If so, how can my
> application detect that a file is in UTF-8 or ANSI?
That's impossible to do with certainty. Even a BOM is a valid (though rather
likely meaningless) character sequence in ANSI (and the next question would be:
what is ANSI? Windows-1252? Windows-1250?).
Cheers,

Signature
Joerg Jooss
www.joergjooss.de
news@joergjooss.de
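
Although certainty is out of reach, applications commonly fall back on a heuristic: look for a BOM, then check whether the bytes decode as strict UTF-8. The sketch below is only that, a guess. The function name `guess_encoding` is invented, and treating every non-UTF-8 file as Windows-1252 is an assumption, which is exactly the "what's ANSI?" problem raised above.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Heuristic guess only; it cannot prove which encoding was intended."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # BOM present, most likely UTF-8
    try:
        data.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return "cp1252"  # assumption: "ANSI" means Windows-1252
    return "utf-8"  # also covers pure ASCII, which is identical in both


# A Windows-1252 file that really starts with the three characters whose
# bytes are EF BB BF is misread as UTF-8 with a BOM.
false_positive = guess_encoding("ï»¿".encode("cp1252"))
print(false_positive)
```

The final lines show the false positive described in the post: legitimate ANSI text can begin with the same three bytes.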