Kirix Support Forums

Calculated fields

Please let us know if you are encountering problems with the beta or have ideas for how it can be improved. Please provide as much detail as possible so we can try to repeat the problem. Thanks!

Calculated fields

Postby BarthaziAndras on Thu Sep 13, 2007 12:58 pm

I have these tables: http://www.nemzetisport.hu/cikk.php?cikk=153057&

In the third column there is nationality and the group together, and I would like to extract the nationality and the group.

REGEXP(Field3, "^\w+")
REGEXP(Field3, "[^ ]+$")

It's working well, but I think it's just not a great solution. I would use these formulas:

REGEXP(Field3, "^\s*(.*?)\s*,")
REGEXP(Field3, ",\s*(.*?)\s*$")

In English: if there's a grouping operator "()" in the regexp, then the regexp's result should be the content of the operator, not the whole match.

With this data source, the first solution is OK, but the second solution is not just grabbing the data, but trimming it, too, what may make an import process easier.

Andras
BarthaziAndras
Registered User
 
Posts: 9
Joined: Thu Sep 13, 2007 12:08 pm

Re: Calculated fields

Postby Aaron on Thu Sep 13, 2007 2:32 pm

Your regular expressions are correct. The only thing you're missing is more complete documentation on the REGEXP function. :)

In fact, we can extract the part of the regular expression in parenthesis by supplying an optional third parameter. Two syntaxes are supported:

REGEXP(<string>, <exp>, <integer>)
REGEXP(<string>, <exp>, <string>)

With an integer parameter, the integer denotes which sub-match included in parenthesis we want to return. For example,

REGEXP("ABC", "(A)B(C)", 1)

returns A, which is the first submatch, and

REGEXP("ABC", "(A)B(C)", 2)

returns C, which is the second submatch.

In your example, we would use the following to extract the parts in the parenthesis:

REGEXP(Field3, "^\s*(.*?)\s*,", 1)
REGEXP(Field3, ",\s*(.*?)\s*$", 1)

However, because we can denote submatches, we can even combine the two regexes into one and use the integer parameter to extract the first and second parts:

REGEXP(Field3, "^\s*(.*?)\s*,\s*(.*?)\s*$", 1)
REGEXP(Field3, "^\s*(.*?)\s*,\s*(.*?)\s*$", 2)

The string parameter works similar to the integer parameter, except that we use a $ in front of the number. For example, following the patterns above, we would use:

REGEXP(Field3, "^\s*(.*?)\s*,", "$1")
REGEXP(Field3, ",\s*(.*?)\s*$", "$1")

The beauty of this syntax is that we can easily rearrange or replace the matching portions of the string to construct new strings. For example, if we want to reverse the parts of the string separate by the comma, we can combine the expressions and use the third parameter to reorganize the matching parts:

REGEXP(Field3, "^\s*(.*?)\s*,\s*(.*?)\s*$", "$2 $1")

Finally, because we can nest functions, we can apply additional functions outside the regular expression function. For example, after reversing the parameters, we can capitalize the string using the UPPER function around the REGEXP function:

UPPER(REGEXP(Field3, "^\s*(.*?)\s*,\s*(.*?)\s*$", "$2 $1"))

Overall, regular expressions are extremely useful for extracting portions of data, and using them in calculated fields like you're using them opens up huge possibilities not only for cleaning up lists of names, addresses, phone numbers, or other contact information, but also quickly prototyping them for use in other code.
Aaron Williams
Kirix Support Team
User avatar
Aaron
Kirix Support Team
 
Posts: 120
Joined: Fri Dec 16, 2005 3:01 pm

Re: Calculated fields

Postby BarthaziAndras on Thu Sep 13, 2007 3:03 pm

Great, thanks for the detailed information. :)
BarthaziAndras
Registered User
 
Posts: 9
Joined: Thu Sep 13, 2007 12:08 pm

Return to Bug Reports & Suggestions