Austin Group Bug Tracker
2014-08-27 16:16:20 UTC
The following issue has been SUBMITTED.
======================================================================
http://austingroupbugs.net/view.php?id=872
======================================================================
Reported By: nsz
Assigned To:
======================================================================
Project: 1003.1(2013)/Issue7+TC1
Issue ID: 872
Category: Base Definitions and Headers
Type: Clarification Requested
Severity: Editorial
Priority: normal
Status: New
Name: Szabolcs Nagy
Organization: musl libc
User Reference:
Section: 9.3.5 RE Bracket Expression
Page Number: -
Line Number: -
Interp Status: ---
Final Accepted Text:
======================================================================
Date Submitted: 2014-08-27 16:16 UTC
Last Modified: 2014-08-27 16:16 UTC
======================================================================
Summary: REG_ICASE regex matching and negated bracket expr
Description:
In chapter 9, the case-insensitive matching of negated (^) bracket
expressions is inconsistent with historical practice.
(1) Case-insensitive matching according to section 9.2 "Regular
Expression General Requirements":
"when each character in the string is matched against the pattern, not
only the character, but also its case counterpart (if any), shall be
matched."
(2) Rule 3 in 9.3.5 "RE Bracket Expression":
"A non-matching list expression begins with a <circumflex> ( '^' ), and
specifies a list that shall match any single-character collating
element except for the expressions represented in the list after the
leading <circumflex>."
Taken together, these two rules mean that [^a] should match both 'a'
and 'A' with REG_ICASE: by (1), both 'a' and 'A' are tried when either
of them is matched against the bracket expression, and 'A' does match
[^a] according to (2).
On historical implementations, [^a] matches neither 'a' nor 'A' with
REG_ICASE.
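
The literal reading described above can be sketched as a small model.
This is only an illustration of the argument, not any implementation's
matcher: the function names are hypothetical, the list is assumed to
contain single characters only (no ranges, classes, or multi-character
collating elements), and the case counterpart is approximated with
Python's str.swapcase().

```python
def case_counterpart(ch):
    """Return the other-case form of ch, or None if it has none.

    Assumption: a one-character str.swapcase() result stands in for the
    locale's case counterpart.
    """
    swapped = ch.swapcase()
    return swapped if swapped != ch and len(swapped) == 1 else None


def matches_nonmatching_list(ch, listed):
    """Rule (2): [^...] matches any character not represented in the list."""
    return ch not in listed


def literal_icase_match(ch, listed):
    """Rule (1) layered on rule (2): try ch and also its case counterpart."""
    candidates = [ch]
    other = case_counterpart(ch)
    if other is not None:
        candidates.append(other)
    return any(matches_nonmatching_list(c, listed) for c in candidates)


# Model of [^a] with REG_ICASE under the literal reading.
for ch in "aAb":
    print(ch, literal_icase_match(ch, {"a"}))
```

Under this model both 'a' and 'A' are accepted by [^a], because for
either input the candidate 'A' is not in the list. That is the result
the report says differs from historical implementations.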
Desired Action:
Change
"A non-matching list expression begins with a <circumflex> ( '^' ), and
specifies a list that shall match any single-character collating element
except for the expressions represented in the list after the leading
<circumflex>."
to
"A non-matching list expression begins with a <circumflex> ( '^' ), and
specifies a list that shall match any single-character collating element
except for the ones that match the expressions represented in the list
after the leading <circumflex>. Matching the expressions in the list is
done without regard to the case when the regular expression is matched
case-insensitively."
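
One possible reading of the proposed wording is sketched below, under
the same simplifying assumptions as the report's example: the list
holds single characters only, and comparing "without regard to the
case" is approximated by lowercasing both sides with str.lower(). The
function names are hypothetical and this is not taken from any
implementation.

```python
def fold(ch):
    """Case-fold a single character (assumption: str.lower() is adequate)."""
    return ch.lower()


def proposed_icase_match(ch, listed):
    """[^...] under REG_ICASE per the proposed text.

    The list is matched without regard to case first; the non-matching
    list then accepts only characters that did not match the list.
    """
    folded_list = {fold(item) for item in listed}
    return fold(ch) not in folded_list


# Model of [^a] with REG_ICASE under the proposed wording.
for ch in "aAb":
    print(ch, proposed_icase_match(ch, {"a"}))
```

In this model [^a] rejects both 'a' and 'A' and accepts 'b', which is
the historical behavior the report describes. Writing the list as
{"A"} gives the same result.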
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2014-08-27 16:16 nsz New Issue
2014-08-27 16:16 nsz Name => Szabolcs Nagy
2014-08-27 16:16 nsz Organization => musl libc
2014-08-27 16:16 nsz Section => 9.3.5 RE Bracket Expression
2014-08-27 16:16 nsz Page Number => -
2014-08-27 16:16 nsz Line Number => -
======================================================================