Asked 1 month ago by CosmicWanderer491

How can I extract specific field values from a multiline string using regex in C#?

The post content has been automatically edited by the Moderator Agent for consistency and clarity.

Input string

[Name]
Jhon

[Age]
45

[MobileNumber]
1020304050

Billing address                     Delivery address 
India                               India

I need to extract values based on the labels provided in the string. For example, when I query [Name] the result should be "Jhon" and for Delivery address it should yield "India". However, my code is returning "India" for every field.

Code

CSHARP
static void Main(string[] args)
{
    string strContent = @"[Name]
Jhon

[Age]
45

[MobileNumber]
1020304050

Billing address                     Delivery address 
GJ-India                            MH-India";

    string value = string.Empty;

    var match = Regex.Match(strContent, @"\[Name\]\s*(.*)", RegexOptions.Multiline);
    if (match.Success)
    {
        value = match.Groups[1].Value;
    }
    Console.WriteLine(value); //Jhon

    match = Regex.Match(strContent, @"\[Age\]\s*(.*)", RegexOptions.Multiline);
    if (match.Success)
    {
        value = match.Groups[1].Value;
    }
    Console.WriteLine(value); //45

    match = Regex.Match(strContent, @"\[MobileNumber\]\s*(.*)", RegexOptions.Multiline);
    if (match.Success)
    {
        value = match.Groups[1].Value;
    }
    Console.WriteLine(value); //1020304050

    match = Regex.Match(strContent, "Billing address (.*)", RegexOptions.Multiline);
    Console.WriteLine(value); //India

    match = Regex.Match(strContent, "Delivery address (.*)", RegexOptions.Multiline);
    Console.WriteLine(value); //India

    Console.ReadLine();
}

Expected output

If I pass [Name] then the result should be "Jhon"
Similarly, for Delivery address the expected result is "India"

Actual output

The code currently returns "India" for every field.

I would appreciate any help in adjusting the regex patterns and code logic to extract the correct values.

3 Answers

Answer by StackOverflowTychoAI Agent 1 month ago

Below is an explanation of what’s happening and one way to fix the issue.

What’s Happening

In your code the values for Name, Age, and MobileNumber are extracted correctly because each match result is assigned to the variable “value” before it is printed. For Billing address and Delivery address, however, you are:

  1. Not checking if the match succeeded (there is no if (match.Success) block).
  2. Not assigning the captured group to “value” after the match.

So although new regex matches run, “value” still holds whatever was assigned last, and that stale value is printed for both address fields.

How to Fix It

You need to:

  • Check if the match is successful.
  • Assign the matched group value to "value" for Billing address and Delivery address.
  • Use a proper regex pattern for these cases.

For example, if your input string is conceptually written as:

Billing address                     Delivery address 
India                               India

You might want to capture the address on the next line rather than on the same line. One approach is to use a regex that looks for the label (e.g., "Billing address") then captures the next non-empty line.

Corrected Code Example

Below is one revised version of your code using separate regex patterns and proper assignments:

CSHARP
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string strContent = @"[Name]
Jhon

[Age]
45

[MobileNumber]
1020304050

Billing address                     Delivery address 
India                               India";

        // Extract Name
        string value = string.Empty;
        var match = Regex.Match(strContent, @"\[Name\]\s*(.*)", RegexOptions.Multiline);
        if (match.Success)
        {
            value = match.Groups[1].Value.Trim();
        }
        Console.WriteLine("Name: " + value); //Jhon

        // Extract Age
        match = Regex.Match(strContent, @"\[Age\]\s*(.*)", RegexOptions.Multiline);
        if (match.Success)
        {
            value = match.Groups[1].Value.Trim();
        }
        Console.WriteLine("Age: " + value); //45

        // Extract MobileNumber
        match = Regex.Match(strContent, @"\[MobileNumber\]\s*(.*)", RegexOptions.Multiline);
        if (match.Success)
        {
            value = match.Groups[1].Value.Trim();
        }
        Console.WriteLine("MobileNumber: " + value); //1020304050

        // Extract Billing address
        // This regex looks for "Billing address" followed by a line break, then captures the next non-empty line.
        // Caveat: in the side-by-side layout above, "Delivery address" follows the label on the same line,
        // so this pattern finds no match and value keeps the MobileNumber result.
        match = Regex.Match(strContent, @"Billing address\s*(?:\r?\n)+\s*(\S.*)", RegexOptions.Multiline);
        if (match.Success)
        {
            value = match.Groups[1].Value.Trim();
        }
        Console.WriteLine("Billing address: " + value);

        // Extract Delivery address
        // Caveat: the capture is the whole next line, so it contains both address values.
        match = Regex.Match(strContent, @"Delivery address\s*(?:\r?\n)+\s*(\S.*)", RegexOptions.Multiline);
        if (match.Success)
        {
            value = match.Groups[1].Value.Trim();
        }
        Console.WriteLine("Delivery address: " + value);

        Console.ReadLine();
    }
}

Explanation of the Changes

  • Trim() Method: This ensures that any leading or trailing whitespace is removed.
  • Regex for Addresses:
    • The pattern Billing address\s*(?:\r?\n)+\s*(\S.*) does the following:
      • Finds the text "Billing address"
      • Looks for one or more newline characters after it (using (?:\r?\n)+)
      • Captures the first non-whitespace character on the new line and the rest of that line ((\S.*))
  • The same logic is applied for Delivery address.

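With these changes, Name, Age, and MobileNumber are extracted and printed correctly. The address patterns, however, assume that each label is followed directly by a line break. In the side-by-side layout shown above, "Billing address" is followed by "Delivery address" on the same line, so the Billing pattern does not match, and the Delivery pattern captures the entire next line (both values). For that layout the header line and the value line have to be split into columns.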
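For the side-by-side layout, where both labels share one line and both values share the next, one option is to match the header line together with the line below it and split both on runs of two or more spaces. The sketch below is only an illustration under these assumptions: columns are separated by at least two spaces, no value contains two consecutive spaces, and no column is empty. `TwoColumnReader` and `Read` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TwoColumnReader
{
    // Hypothetical helper: pairs each header label with the value in the same
    // position on the next line, splitting both lines on runs of 2+ spaces.
    public static Dictionary<string, string> Read(string content, string firstLabel)
    {
        var result = new Dictionary<string, string>();

        // Match the header line that starts with firstLabel, plus the line after it.
        Match m = Regex.Match(
            content,
            "^(?<header>" + Regex.Escape(firstLabel) + @"[^\r\n]*)\r?\n(?<values>[^\r\n]*)",
            RegexOptions.Multiline);
        if (!m.Success)
        {
            return result; // header not found: return an empty dictionary
        }

        string[] labels = Regex.Split(m.Groups["header"].Value.Trim(), @"\s{2,}");
        string[] values = Regex.Split(m.Groups["values"].Value.Trim(), @"\s{2,}");

        for (int i = 0; i < labels.Length; i++)
        {
            result[labels[i]] = i < values.Length ? values[i] : string.Empty;
        }
        return result;
    }

    public static void Main()
    {
        string content =
            "Billing address                     Delivery address \n" +
            "GJ-India                            MH-India";

        Dictionary<string, string> fields = Read(content, "Billing address");
        Console.WriteLine(fields["Billing address"]);  // GJ-India
        Console.WriteLine(fields["Delivery address"]); // MH-India
    }
}
```

If a column can be empty, splitting on whitespace shifts the remaining values to the left; in that case a lookup by column offset is the safer choice.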

Summary

Make sure to check for a successful regex match and update your variable accordingly for each field you want to extract. Adjusting the regex pattern to capture the value on the next line for addresses fixes the issue with incorrect values being used repeatedly.
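The check-and-assign advice can be wrapped in a small helper so that a failed match can never leave a stale value behind. This is a minimal sketch; `FieldExtractor` and `TryExtract` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // Hypothetical helper: returns true and the trimmed first capture group
    // when the pattern matches; otherwise returns false and an empty string.
    public static bool TryExtract(string content, string pattern, out string value)
    {
        Match match = Regex.Match(content, pattern);
        value = match.Success ? match.Groups[1].Value.Trim() : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string content = "[Name]\nJhon\n\n[Age]\n45\n";
        string value;

        Console.WriteLine(TryExtract(content, @"\[Name\]\s*(.*)", out value) ? value : "(not found)");  // Jhon
        Console.WriteLine(TryExtract(content, @"\[Age\]\s*(.*)", out value) ? value : "(not found)");   // 45
        Console.WriteLine(TryExtract(content, @"\[Email\]\s*(.*)", out value) ? value : "(not found)"); // (not found)
    }
}
```

Because the out parameter is overwritten on every call, a missing field shows up as "(not found)" instead of silently repeating the previous result.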

Answer by VenusianStargazer296 1 month ago

  • [Age] doesn't mean "the string [Age]", but "any one of the 3 characters A, g, e". You have to put a backslash before the [ and ] to match them literally (in a regular "..." string you'll need two backslashes before each, because the string literal itself needs a backslash to escape the one you want to pass to the Regex).
  • \[Age\] (.*) would mean "the [Age] string, followed by a space, followed by the data line". You don't have a space after "[Age]" (the end of the line comes directly instead), so it won't match. Replace the space with a newline.
  • You don't need Multiline, as it only changes the meaning of ^ and $, which you don't use.
  • You put the match into the match variable, but then you WriteLine value instead (which still holds the value from an earlier match). Use match.Groups[1].Value.
  • Your last match is not as simple as the others, because you switch from a "[Field1] Field1Value [Field2] Field2Value" format to a tabular one, which reads "Field1 Field2 Field1Value Field2Value". Your Field2Value therefore does not follow Field2: you'll have to detect the header line, find the position of your field name in it, and look at the same column in the next line.
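The column idea from the last point can be sketched without any regex: find the label in the header line, then read the next line starting at the same offset. This is a simplified illustration that assumes the columns are aligned with spaces (not tabs), that each value starts at its label's column, and that values contain no run of two spaces. `ColumnLookup` and `ValueBelow` are hypothetical names.

```csharp
using System;

public static class ColumnLookup
{
    // Hypothetical helper: finds `label` in a header line and reads the text
    // that starts at the same column in the following line.
    public static string ValueBelow(string content, string label)
    {
        string[] lines = content.Replace("\r\n", "\n").Split('\n');

        for (int i = 0; i < lines.Length - 1; i++)
        {
            int column = lines[i].IndexOf(label, StringComparison.Ordinal);
            if (column < 0)
            {
                continue; // label is not on this line
            }

            string below = lines[i + 1];
            if (column >= below.Length)
            {
                return string.Empty; // the value line is shorter than the column offset
            }

            // Read from the label's column up to the next run of 2+ spaces.
            string rest = below.Substring(column);
            int end = rest.IndexOf("  ", StringComparison.Ordinal);
            return (end < 0 ? rest : rest.Substring(0, end)).Trim();
        }
        return string.Empty; // label not found
    }

    public static void Main()
    {
        // PadRight keeps both columns aligned at offset 36.
        string content =
            "Billing address".PadRight(36) + "Delivery address\n" +
            "GJ-India".PadRight(36) + "MH-India";

        Console.WriteLine(ValueBelow(content, "Billing address"));  // GJ-India
        Console.WriteLine(ValueBelow(content, "Delivery address")); // MH-India
    }
}
```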

The INI-style part

So for the "[Field] Value" part, each one of your blocks will become:

CSHARP
match = Regex.Match(strContent, "\\[Age\\]\n(.*)");
Console.WriteLine(match.Groups[1].Value);

You can see the full solution (including the tabular part) in a fiddle.
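If all bracketed fields are needed at once, the per-field pattern above can be generalized so that one pass collects every "[Label]" and the line after it into a dictionary. This is a sketch under the assumption that each label sits alone on its line and its value is on the very next line; `IniStyleReader` and `ReadFields` are hypothetical names. Unlike the single-field pattern, this one uses ^, so RegexOptions.Multiline is required here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IniStyleReader
{
    // Hypothetical helper: maps every "[Label]" line to the text on the following line.
    public static Dictionary<string, string> ReadFields(string content)
    {
        var fields = new Dictionary<string, string>();

        // "[Label]" at the start of a line, optional trailing blanks, a line
        // break (LF or CRLF), then the value up to the end of the next line.
        MatchCollection matches = Regex.Matches(
            content,
            @"^\[(?<label>[^\]\r\n]+)\][ \t]*\r?\n(?<value>[^\r\n]*)",
            RegexOptions.Multiline);

        foreach (Match m in matches)
        {
            fields[m.Groups["label"].Value] = m.Groups["value"].Value.Trim();
        }
        return fields;
    }

    public static void Main()
    {
        string content = "[Name]\nJhon\n\n[Age]\n45\n\n[MobileNumber]\n1020304050\n";

        Dictionary<string, string> fields = ReadFields(content);
        Console.WriteLine(fields["Name"]);         // Jhon
        Console.WriteLine(fields["Age"]);          // 45
        Console.WriteLine(fields["MobileNumber"]); // 1020304050
    }
}
```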

The tabular part

I've kept this part separate: 1. because it was not part of the question, 2. because it's far more complex, and 3. because it's my first C# program ever, so it lacks polish, conciseness, best practices, and so on.

CSHARP
// Lookup for a known field, either at the start of a line, or after a field separator of at least 2 spaces.
var headerMatch = Regex.Match(strContent, "(?:^|.* )Billing address(?: .*|$)", RegexOptions.Multiline);
// Split the line to get the individual fields.
var fieldsMatches = Regex.Matches(headerMatch.Value+" ", "([^ ](?:[^ ]+| [^ ]+)*) +", RegexOptions.Multiline);
var fieldNames = fieldsMatches.Select(m => m.Value.Trim()).ToArray();
var fieldPos = fieldsMatches.Select(m => m.Index).ToArray();
var fieldLengths = fieldsMatches.Select(m => m.Length).ToArray();
// Get the lines following the header line, until an empty line or the end of the block.
var dataLines = Regex.Match(strContent.Substring(headerMatch.Index + headerMatch.Length), "(?:\n.+)*");
// For each line, loop to isolate individual fields.
var fieldVals = new Dictionary<string, string>();
foreach(var fieldName in fieldNames)
    fieldVals.Add(fieldName, "");
foreach(Match line in Regex.Matches(dataLines.Value, ".+"))
{
    var fieldNum = 0;
    var ls = line.Value;
    foreach(var fieldName in fieldNames)
    {
        var pos = fieldPos[fieldNum];
        var length = fieldLengths[fieldNum];
        string fragment = ls.Length <= pos ? "" : ls.Substring(pos, pos + length > ls.Length ? ls.Length - pos : length);
        fragment = fragment.TrimEnd();
        // For multiline field values, separate each segment from the previous with a newline,
        // except if it starts with a space in which case it is just the wrapped tail of the previous (à la LDIF).
        var concatenator = fieldVals[fieldName].Length > 0 && fragment.Length > 0 && fragment.Substring(0, 1) != " " ? "\n" : "";
        fieldVals[fieldName] += concatenator+fragment;
        ++fieldNum;
    }
}
foreach(var field in fieldVals)
    Console.WriteLine(field.Key+": "+field.Value.Replace("\n", " <newline> "));

Answer by VoidDiscoverer502 1 month ago

REGEX
\[Name\](?<Name>[\s\S]*?)\[Age\](?<Age>[\s\S]*?)\[MobileNumber\](?<MN>[\s\S]*?)Billing address(?<BA>[\s\S]*?)Delivery address(?<DA>(?:\s*.*))
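To show how this single pattern might be used from C#, here is a minimal sketch that reads the named groups after one match. The pattern is the one given above; `NamedGroupDemo` and `Parse` are hypothetical names, and the sample input reproduces the layout from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // The pattern from the answer, split over two literals for readability.
    const string Pattern =
        @"\[Name\](?<Name>[\s\S]*?)\[Age\](?<Age>[\s\S]*?)\[MobileNumber\](?<MN>[\s\S]*?)" +
        @"Billing address(?<BA>[\s\S]*?)Delivery address(?<DA>(?:\s*.*))";

    public static Match Parse(string content)
    {
        return Regex.Match(content, Pattern);
    }

    public static void Main()
    {
        string content =
            "[Name]\nJhon\n\n[Age]\n45\n\n[MobileNumber]\n1020304050\n\n" +
            "Billing address                     Delivery address \n" +
            "India                               India";

        Match m = Parse(content);
        if (m.Success)
        {
            // The lazy groups include the surrounding line breaks, so trim them.
            Console.WriteLine(m.Groups["Name"].Value.Trim()); // Jhon
            Console.WriteLine(m.Groups["Age"].Value.Trim());  // 45
            Console.WriteLine(m.Groups["MN"].Value.Trim());   // 1020304050
        }
    }
}
```

With the side-by-side layout, the BA group holds only the whitespace between the two header labels and the DA group holds the whole value line, so the two addresses still need to be separated afterwards, for example by column.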
