javascript - How to check Bosnian-specific characters in RegEx? - Stack Overflow

I have this Regular Expression pattern, which is quite simple and it validates if the provided string is "alpha" (both uppercase and lowercase):

var pattern = /^[a-zA-Z]+$/gi;

When I run pattern.test('Zlatan Omerovic') it returns true; however, if I run:

pattern.test('Zlatan Omerović');

It returns false and it fails my validation.

In the Bosnian language we have these specific characters:

š đ č ć ž

And uppercased:

Š Đ Č Ć Ž

Is it possible to validate these characters (both cases) with JavaScript regular expression?

asked Apr 12, 2013 at 22:05 by user1386320
  • Yes, what have you tried? Protip: just add those between the square brackets. – Fabrício Matté Commented Apr 12, 2013 at 22:06
  • @FabrícioMatté - exactly that, what you see in the question :) – user1386320 Commented Apr 12, 2013 at 22:07
  • I meant, it looks like you just copypasta'd some regex that validates alphabetical characters but ok. If you look into the meaning of those square brackets - a character class - you'd know how to fix such regex. – Fabrício Matté Commented Apr 12, 2013 at 22:09
  • @FabrícioMatté: The character class a-z could well encompass š to a Bosnian. It doesn't in JavaScript, but that doesn't make it illogical from a non-English perspective. – T.J. Crowder Commented Apr 12, 2013 at 22:13
  • @T.J.Crowder I believe JS's character classes' ranges are ASCII code based, no? In that case a-z represents characters 97-122 (and 65-90 with the case-insensitive flag) only. Or these are UTF-8 based, not sure. – Fabrício Matté Commented Apr 12, 2013 at 22:15

3 Answers

Sure, you can just add those characters to the list of characters you're matching. Also, since you're doing a case-insensitive match (the i flag), you don't need the uppercase characters.

var pattern = /^[a-zšđčćž ]+$/gi;

Fiddle here: http://jsfiddle/ryanbrill/KB74b/

Here's an alternate pattern that uses Unicode escape sequences, which might be better: embedding the characters literally won't work if the file isn't saved with the proper encoding, for instance.

var pattern = /^[a-z\u0161\u0111\u010D\u0107\u017E ]+$/gi;

http://jsfiddle/ryanbrill/KB74b/2/
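
As a quick check, here is a minimal sketch comparing the two patterns above on the names from the question. The variable names and the sample list are mine, not from the fiddles. I also dropped the g flag, which is my own change: with g, test() keeps lastIndex between calls on the same regex object, so repeated validation can alternate between true and false.

```javascript
// Literal characters and \u escapes describe the same character class.
// The space inside the brackets is what lets "Zlatan Omerović" pass;
// the class in the question has no space.
// No g flag here (my change): with g, test() is stateful via lastIndex.
const literalPattern = /^[a-zšđčćž ]+$/i;
const escapedPattern = /^[a-z\u0161\u0111\u010D\u0107\u017E ]+$/i;

// Hypothetical sample inputs for illustration.
const samples = ['Zlatan Omerovic', 'Zlatan Omerović', 'ŠĐČĆŽ', 'Zlatan123'];

const results = samples.map((text) => ({
  text,
  literal: literalPattern.test(text),
  escaped: escapedPattern.test(text),
}));

console.table(results);
```

Both columns should agree for every input; only the string containing digits should be rejected.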

a-zA-Z means exactly that, and in an English-centric way: abcdefghijklmnopqrstuvwxyz. Sadly, JavaScript's regular expressions don't have a locale-sensitive "alpha" definition, so if you want to test other alphabetic characters, you have to list them explicitly. You can either do that literally (for instance, by including š in the regular expression), or using Unicode escape sequences (such as \u0161). If the additional Bosnian alphabetic characters in question have a contiguous range, you can use the - notation with them as well, but it has to be separate from the a-z, which is defined in English terms.
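
To illustrate why a-z misses these letters, here is a small sketch that prints each letter's code point and checks it against the plain class. The names letters and report are hypothetical, chosen for this example only.

```javascript
// a-z spans code points U+0061..U+007A only; the Bosnian letters sit
// in the Latin Extended-A block, well outside that range.
const letters = ['\u0161', '\u0111', '\u010D', '\u0107', '\u017E']; // š đ č ć ž

const report = letters.map((ch) => ({
  ch,
  // Format the code point as U+XXXX for readability.
  codePoint: 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'),
  // Even with the i flag, the plain English class does not match.
  matchesAtoZ: /^[a-z]$/i.test(ch),
}));

console.table(report);
```

The lowercase letters are not adjacent to each other, so a single range cannot cover all five; each one has to appear in the class on its own or paired with its uppercase neighbour.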

To make the test accept the first (S-based) symbol of your five, in both cases, I did:

var pattern = /^[a-zA-Z\u0160-\u0161]+$/g;

Try to add all the symbols you need this way ;)
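
Extending that idea to all five letters could look like the sketch below. Each uppercase/lowercase pair happens to be adjacent in Unicode, so each gets its own two-character range. The helper name isBosnianAlpha is hypothetical, and I left out the g flag (my choice) so that repeated test() calls don't depend on lastIndex.

```javascript
// Ranges, one per letter pair:
//   Ć/ć U+0106-U+0107, Č/č U+010C-U+010D, Đ/đ U+0110-U+0111,
//   Š/š U+0160-U+0161, Ž/ž U+017D-U+017E
const bosnianAlpha =
  /^[a-zA-Z\u0106-\u0107\u010C-\u010D\u0110-\u0111\u0160-\u0161\u017D-\u017E]+$/;

// Hypothetical helper: true if text consists only of English or
// Bosnian-specific letters. Like the pattern above, it does not
// allow spaces; add a space inside the brackets if full names are expected.
function isBosnianAlpha(text) {
  return bosnianAlpha.test(text);
}

console.log(isBosnianAlpha('Omerović'));        // single word with ć
console.log(isBosnianAlpha('ŠĐČĆŽšđčćž'));      // all ten special letters
console.log(isBosnianAlpha('Zlatan Omerović')); // contains a space
```

The first two calls should return true and the third false, because the class has no space in it.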
