string.Trim() and White spaces list?

adi - 07 Nov 2006 06:48 GMT
Hi

I'm writing documentation for my application.
I need to explain to the reader that white space will be removed from
a text.
I use the string.Trim() method. Note: no arguments are passed to the method.
Telling this to an untrained person is not enough; I need to give
him the complete list of whitespace characters, like:
1. space: ' '
2. tab: '\t'

My knowledge of what "whitespace" means stops here: the space character and
the tab character. What else?
Can I query the framework at runtime for the complete list of whitespace
characters? I'm only able to test whether a particular character is
whitespace (using char.IsWhiteSpace(...)).

Thanks.
Morten Wennevik - 07 Nov 2006 07:19 GMT
Hi Adi,

There is a list of whitespace characters in the documentation for
String.Trim():

http://msdn2.microsoft.com/en-us/library/t97s7bs3(VS.80).aspx

Happy Coding!
Morten Wennevik [C# MVP]

adi - 07 Nov 2006 07:42 GMT
Thanks Morten

The list is very useful.
Now, for the second part of my question: is it possible to get
this list at runtime?
Note: I'm (still) using version 1.1 of the framework, but solutions
for later versions are welcome.

Thanks.
Adi.

Morten Wennevik - 07 Nov 2006 08:16 GMT
The list is the same for .NET 1.0, 1.1 and 2.0, and possibly later versions too.

As for getting this list at runtime, I don't see how you can do that other
than calling Char.IsWhiteSpace on a whole range of character codes, which may
take some time to compute. I ran a few tests with Char.IsWhiteSpace and ended
up with a list of far more characters than are listed under String.Trim.
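
A minimal sketch of that kind of scan, in case it helps. The class and method names are made up for illustration, and List<T> assumes .NET 2.0 or later (on 1.1 an ArrayList would have to stand in). The loop covers only 65,536 values, so in practice it should finish quickly.

```csharp
using System;
using System.Collections.Generic;

static class WhiteSpaceScan
{
    // Returns every UTF-16 code unit for which char.IsWhiteSpace is true.
    // The result depends on the framework version the code runs on.
    public static List<char> FindWhiteSpace()
    {
        List<char> found = new List<char>();
        // Use an int counter so the loop can include char.MaxValue (U+FFFF)
        // without overflowing.
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.IsWhiteSpace(c))
                found.Add(c);
        }
        return found;
    }

    static void Main()
    {
        // Print each whitespace character as a four-digit hex code.
        foreach (char c in FindWhiteSpace())
            Console.WriteLine("U+{0:X4}", (int)c);
    }
}
```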

Why do you need this list programmatically anyway?

Morten Wennevik - 07 Nov 2006 08:59 GMT
Actually, you can't use IsWhiteSpace to determine which characters are
trimmed, as there are whitespace characters that are not trimmed.
Furthermore, there are characters that are trimmed but not listed in
the documentation.

In the end, to get the proper list you may need to try trimming every
single character to see whether String.Trim() removes it.

The code below displays which characters are considered whitespace and
which are trimmed.

            // Requires: using System.Text; using System.Windows.Forms;
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i <= 0xFFFF; i++) // include U+FFFF itself
            {
                char c = (char)i;
                bool isWhiteSpace = char.IsWhiteSpace(c);
                // A one-character string that trims to empty means Trim() removes c.
                bool isTrimmed = c.ToString().Trim().Length == 0;

                if (isWhiteSpace || isTrimmed)
                {
                    sb.Append(i.ToString("X4")); // four-digit hex code
                    sb.Append(isWhiteSpace ? "\tWhiteSpace" : "\t\t");
                    if (isTrimmed)
                        sb.Append("\tTrimmed");
                    sb.AppendLine(); // use sb.Append("\r\n"); for .NET 1.1
                }
            }
            MessageBox.Show(sb.ToString());

Compared to the documented list, this indicates that U+0085, U+1680,
U+2028 and U+2029 will also be trimmed, despite not being listed, while
the whitespace characters U+180E, U+202F and U+205F will not be trimmed.
Characters U+200B and U+FEFF are not considered whitespace characters but
will be trimmed anyway.

Upon even further research: in .NET 1.1 the list is correct and only
documented characters are trimmed, but the documentation has not
been updated for .NET 2.0.
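
Since the trimmed set differs between framework versions, one option for keeping the documentation and the behavior in step is to pass the characters explicitly to the Trim(params char[]) overload. This is only a sketch: the class name, the method name and the four characters in the array are assumptions chosen for illustration, not a recommended list.

```csharp
using System;

static class ExplicitTrim
{
    // Hypothetical list: put exactly the characters your documentation names here.
    static readonly char[] DocumentedWhiteSpace = { ' ', '\t', '\r', '\n' };

    // Trims only the characters listed above from both ends of the text,
    // regardless of what the parameterless Trim() would remove.
    public static string TrimDocumented(string text)
    {
        return text.Trim(DocumentedWhiteSpace);
    }

    static void Main()
    {
        Console.WriteLine("[" + TrimDocumented(" \t hello \r\n") + "]"); // prints "[hello]"
    }
}
```

With this approach the documentation can simply repeat the contents of the array, and characters outside it (a no-break space, for example) are left in place.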


adi - 07 Nov 2006 10:59 GMT
Many thanks
