c# - Parsing amount strings into numbers
I am working on a system that recognizes paper documents using OCR engines. These documents are invoices containing amounts such as the total, the VAT and the net amount. I need to parse these amount strings into numbers, but they come in many formats and flavors, with different symbols for the decimal and thousands separators on each invoice. If I use the normal double.Parse and double.TryParse methods, they usually fail for some of the amounts.
These are some examples of the amounts I receive:
    "3.533.65"     => 3533.65
    "-133.696"     => -133696
    "-33.017"      => -33017
    "-166.713"     => -166713
    "-50888"       => -5088.8
    "0.423"        => 0.423
    "9,215,200"    => 9215200
    "1,443,840.00" => 1443840
My idea is to guess which symbol is the decimal separator and which is the thousands separator for each number, and then let the user decide whether the guess is correct.
I am wondering how to solve this problem in an elegant way.
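The guessing idea described above could be sketched roughly as follows. `SeparatorGuesser`, `GuessDecimalIndex` and the "three digits after a lone separator means thousands" threshold are hypothetical choices made for illustration, not an established algorithm. Strings such as "1.234" are genuinely ambiguous, so a guess like this would still need the user confirmation step.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class SeparatorGuesser
{
    // Returns the index of the guessed decimal separator, or -1 if the
    // string appears to contain grouping separators only (or none at all).
    public static int GuessDecimalIndex(string s)
    {
        int last = s.LastIndexOfAny(new[] { '.', ',' });
        if (last < 0) return -1;

        char symbol = s[last];
        char other = symbol == '.' ? ',' : '.';
        int digitsAfter = s.Length - last - 1;

        // Same symbol repeated: "9,215,200" is grouping only,
        // while "3.533.65" ends in a short chunk that reads as decimals.
        if (s.Count(c => c == symbol) > 1)
            return digitsAfter == 3 ? -1 : last;

        // Both symbols present ("1,443,840.00"): the last one is the decimal.
        if (s.IndexOf(other) >= 0) return last;

        // One lone separator: exactly three digits after a non-zero integer
        // part of 1-3 digits reads as thousands ("-133.696"); anything else
        // reads as a decimal ("0.423").
        string intPart = s.Substring(0, last).TrimStart('-');
        bool looksGrouped = digitsAfter == 3
            && intPart.Length >= 1 && intPart.Length <= 3
            && intPart != "0";
        return looksGrouped ? -1 : last;
    }

    public static decimal Parse(string s)
    {
        int decimalIndex = GuessDecimalIndex(s);
        var sb = new StringBuilder();
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            if ((c >= '0' && c <= '9') || c == '-') sb.Append(c);
            else if (i == decimalIndex) sb.Append('.'); // normalise to '.'
            // every other separator is treated as grouping and dropped
        }
        return decimal.Parse(sb.ToString(), NumberStyles.Number,
                             CultureInfo.InvariantCulture);
    }
}
```

Returning the index rather than the symbol matters for inputs like "3.533.65", where the same character serves as both the grouping and the decimal separator.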
I would probably set up a list of rules, specified in order of precedence. That way you can plug in new rules as you need them, then iterate through the list and return on the first regex match.
Setting up a quick prototype like this would be very easy:
    using System.Globalization;

    public class FormatRule
    {
        public string Pattern { get; set; }
        public CultureInfo Culture { get; set; }

        public FormatRule(string pattern, CultureInfo culture)
        {
            Pattern = pattern;
            Culture = culture;
        }
    }
Now a list of FormatRule objects to store your rules in order of precedence:
    List<FormatRule> Rules = new List<FormatRule>()
    {
        /* Rules are listed in order of precedence.
         * For this example en-US and fr-FR are selected, but you can
         * equally swap in any culture that fits your formats.
         * Note: fr-FR groups digits with a space rather than '.', so a
         * culture such as de-DE may suit "1.234,56"-style input better.
         * [1-9] excludes zeros in the leading digits; widen it to \d if
         * inputs such as "100,000" can occur. */
        new FormatRule(@"^0\.\d+$", CultureInfo.GetCultureInfo("en-US")),
        new FormatRule(@"^0,\d+$", CultureInfo.GetCultureInfo("fr-FR")),
        new FormatRule(@"^[1-9]+\.\d{4,}$", CultureInfo.GetCultureInfo("en-US")),
        new FormatRule(@"^[1-9]+,\d{4,}$", CultureInfo.GetCultureInfo("fr-FR")),
        new FormatRule(@"^-?[1-9]{1,3}(,\d{3,})*(\.\d*)?$", CultureInfo.GetCultureInfo("en-US")),
        new FormatRule(@"^-?[1-9]{1,3}(\.\d{3,})*(,\d*)?$", CultureInfo.GetCultureInfo("fr-FR")),

        /* Default rule: the empty pattern matches any input */
        new FormatRule(string.Empty, CultureInfo.CurrentCulture)
    };
You should then be able to iterate over your list to find the correct rule to apply:
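    // Requires: using System.Text.RegularExpressions;
    public CultureInfo FindProvider(string numberString)
    {
        foreach (FormatRule rule in Rules)
        {
            if (Regex.IsMatch(numberString, rule.Pattern))
                return rule.Culture;
        }
        // Fall back to the default rule at the end of the list.
        return Rules[Rules.Count - 1].Culture;
    }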
This setup lets you easily manage the rules and their precedence when an input should be handled one way rather than another. It also lets you handle different cultures and different formats in one place.
    public float ParseValue(string valueString)
    {
        float value = 0;
        NumberStyles style = NumberStyles.Any;
        // Pick the number format of the first rule whose pattern matches.
        IFormatProvider provider = FindProvider(valueString).NumberFormat;
        if (float.TryParse(valueString, style, provider, out value))
            return value;
        else
            throw new InvalidCastException(string.Format(
                "Value '{0}' cannot be parsed with any of the providers in the rule set.",
                valueString));
    }
Finally, call your ParseValue() method to convert the string value to a float:
    string numberString = "-123,456.78"; // or "23.457.234,87"
    float value = ParseValue(numberString);
You could also use a dictionary to save on the additional FormatRule class. The concept is the same... I used a list in the example because it makes LINQ easier to use. Also, you can easily swap the float type for whichever type you need: Single, Double or Decimal.
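As a rough illustration of that variation, here is a sketch that drops the FormatRule class, uses LINQ to pick the first matching rule, and returns a decimal. Several things here are assumptions and not part of the answer above: the name `AmountParser`, the two simplified patterns, and the use of hand-built `NumberFormatInfo` objects in place of named cultures (so the separators are explicit). It also uses an array of tuples rather than a `Dictionary`, because the enumeration order of a `Dictionary` is undefined and precedence matters here.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class AmountParser
{
    // Explicit separator pairs (an assumption, standing in for cultures).
    private static readonly NumberFormatInfo DotDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ".", NumberGroupSeparator = "," };
    private static readonly NumberFormatInfo CommaDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };

    // Rules in order of precedence; the last entry is the catch-all default.
    private static readonly (string Pattern, NumberFormatInfo Format)[] Rules =
    {
        (@"^-?\d{1,3}(,\d{3})*(\.\d+)?$", DotDecimal),    // "-123,456.78"
        (@"^-?\d{1,3}(\.\d{3})*(,\d+)?$", CommaDecimal),  // "23.457.234,87"
        (string.Empty, DotDecimal),                        // default rule
    };

    public static decimal ParseValue(string valueString)
    {
        // First() cannot fail: the empty default pattern matches any input.
        NumberFormatInfo format =
            Rules.First(r => Regex.IsMatch(valueString, r.Pattern)).Format;

        if (decimal.TryParse(valueString, NumberStyles.Number, format, out decimal value))
            return value;

        throw new FormatException(
            $"Value '{valueString}' cannot be parsed with any rule in the set.");
    }
}
```

With this ordering "-133.696" is read as a decimal value. Swapping the first two rules would read it as -133696 instead, which is exactly the kind of precedence decision the rule list is meant to make explicit.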