String comparison in C#

In C#, a string is a sequential read-only collection of Char objects. A Char object is an instance of a Char struct, which represents a character as a UTF-16 code unit.

Most programs, and programmers, do not care much about the internal representation of a string or a Char. As long as you know that strings are immutable, and that building up a string in a loop is expensive unless you use a mutable object such as a StringBuilder (every concatenation allocates a new string), you are mostly fine.
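To make the StringBuilder remark concrete, here is a minimal sketch. The class and method names (StringBuildingDemo, JoinDigits) are made up for illustration; only StringBuilder itself comes from the framework.

```csharp
using System;
using System.Text;

public static class StringBuildingDemo
{
    // Hypothetical helper: builds a string of digits in a loop.
    // The StringBuilder is mutated in place, so the loop does not
    // allocate a new string on every iteration.
    public static string JoinDigits(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i % 10);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinDigits(12)); // prints 012345678901
    }
}
```

Writing the same loop with `result += ...` on a plain string would give the same output, but it creates a fresh string on each pass.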

However, there are some things that are good to know when working with strings. One of them is how comparison works in methods such as String.CompareTo and String.Compare. By default, the Compare and CompareTo methods are culture-sensitive: they use the current culture. This means that, depending on the language and culture settings of the machine running the application, strings might compare and sort differently than they do on your development machine. This might be exactly what you want, or it might come as a complete surprise.
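A small sketch of what culture sensitivity can look like in practice. The helper name CompareIn and the choice of cultures are my own for illustration. In Swedish, "ä" is a separate letter that sorts after "z", while in English it is normally treated as a variant of "a", so the two calls below would typically give results with opposite signs. The exact outcome depends on the globalization data available on the platform, so treat this as an experiment to run rather than a guaranteed result.

```csharp
using System;
using System.Globalization;

public static class CultureCompareDemo
{
    // Hypothetical helper: compares two strings using the rules of a named
    // culture instead of whatever the current culture happens to be.
    public static int CompareIn(string cultureName, string a, string b)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return string.Compare(a, b, culture, CompareOptions.None);
    }

    public static void Main()
    {
        // Same two strings, two different cultures.
        Console.WriteLine("sv-SE: " + Math.Sign(CompareIn("sv-SE", "ä", "z")));
        Console.WriteLine("en-US: " + Math.Sign(CompareIn("en-US", "ä", "z")));
    }
}
```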

The best thing to do here is to override the default behavior and be specific in your code on how you want your application to behave.

So, do not use the CompareTo method at all. Use Compare instead, and not the default overload that takes only two strings, but the one where you explicitly state which type of comparison you want. That is, Compare(String, String, StringComparison).
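As a sketch of the recommended overload, the example below compares the same two strings with two different StringComparison values. The class and method names are made up for illustration. Ordinal comparison looks at the UTF-16 code unit values, so lowercase 'a' (97) sorts after uppercase 'B' (66); OrdinalIgnoreCase ignores the case difference, so "apple" sorts before "Banana".

```csharp
using System;

public static class ExplicitCompareDemo
{
    // Hypothetical helper: returns only the sign (-1, 0 or 1) of the
    // comparison, since the magnitude of the result carries no meaning.
    public static int SignOf(string a, string b, StringComparison comparison)
    {
        return Math.Sign(string.Compare(a, b, comparison));
    }

    public static void Main()
    {
        // Ordinal: 'a' (97) is greater than 'B' (66).
        Console.WriteLine(SignOf("apple", "Banana", StringComparison.Ordinal)); // prints 1

        // OrdinalIgnoreCase: case is ignored, so 'a' sorts before 'b'.
        Console.WriteLine(SignOf("apple", "Banana", StringComparison.OrdinalIgnoreCase)); // prints -1
    }
}
```

The other StringComparison values (CurrentCulture, InvariantCulture and their IgnoreCase variants) are passed the same way. The point is that the choice is visible in the code instead of being inherited from the machine.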

For more info:
https://docs.microsoft.com/en-us/dotnet/api/system.string
