What is lookaround
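Lookaround is a zero-width assertion: it checks the text next to the current position without consuming any characters. Instead of matching specific characters, it matches according to a condition. There are four kinds, where T is the target and C is the condition:
- positive + lookahead: T(?=C) → what follows the [target] must match the condition
- positive + lookbehind: (?<=C)T → what precedes the [target] must match the condition
- negative + lookahead: T(?!C) → what follows the [target] must not match the condition
- negative + lookbehind: (?<!C)T → what precedes the [target] must not match the condition
Lookarounds can also be nested or combined, for example /(A(?=B))(?=C)/ and /A(?=B(?=C))/.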
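For comparison, the boundary assertions ^, $ and \b are also zero-width, but each tests a fixed position (start of input, end of input, word boundary) rather than an arbitrary condition.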
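To make the four kinds concrete, here is a minimal sketch that runs each one over the same string and records where the target matched. The sample text and the helper name `indexesOf` are made up for illustration only.

```javascript
// Hypothetical sample text. Character positions: A0 B1 _ A3 C4 _ B6 A7 _ C9 A10
const text = 'AB AC BA CA';

// Hypothetical helper: start index of every match of `regex` in `text`.
const indexesOf = (regex) => [...text.matchAll(regex)].map((m) => m.index);

const aheadPositive = indexesOf(/A(?=B)/g);   // A followed by B      -> [0]
const behindPositive = indexesOf(/(?<=B)A/g); // A preceded by B      -> [7]
const aheadNegative = indexesOf(/A(?!B)/g);   // A not followed by B  -> [3, 7, 10]
const behindNegative = indexesOf(/(?<!B)A/g); // A not preceded by B  -> [0, 3, 10]

// Zero width: the condition is checked but is not part of the match.
const matched = 'AB'.match(/A(?=B)/)[0]; // 'A', not 'AB'

// Nested lookahead: A must be followed by B, and that B must be followed by C.
const nested = [...'ABC ABD'.matchAll(/A(?=B(?=C))/g)].map((m) => m.index); // [0]

console.log({ aheadPositive, behindPositive, aheadNegative, behindNegative, matched, nested });
```

Because every match is just the single character A, the only thing that changes between the four patterns is which positions qualify.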
positive + lookahead
const insult = 'Adam is such a asshole!';
const praise = 'Adam is such a genius!';
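// When "such " is followed by "a asshole", replace "such " with "not "
const aheadPositive = /such\s(?=a\sasshole)/g;
insult.replace(aheadPositive, 'not '); // "such " becomes "not "; "a asshole" stays because the lookahead does not consume it
praise.replace(aheadPositive, 'not '); // unchanged: 'Adam is such a genius!'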
positive + lookbehind
const insult = 'Xi steamed bun, you son of bitch. Suck my dick';
const para = 'Our steamed bun is on sale now! Come to buy 3 and get 1 for free!';
const ad = 'Does steamed bun your favorite? Come to Mother Zai cooking school. We can teach you make delicious steamed bun~';
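// When "steamed bun" is preceded by "Xi ", replace "steamed bun" with "**"
const behindPositive = /(?<=Xi\s)steamed bun/g;
insult.replace(behindPositive, '**'); // the "steamed bun" right after "Xi " becomes "**"
para.replace(behindPositive, '**'); // unchanged: no "Xi " in front
ad.replace(behindPositive, '**'); // unchanged: no "Xi " in front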
negative + lookahead
const Emma = 'E221930121';
const Allen = 'E129377814';
const Joe = 'A121916907';
const isFemale = (id) => {
  // A letter that is NOT followed by "1 plus eight more digits" (nine digits starting with 1)
  const aheadNegative = /[A-Z](?!1[0-9]{8})/;
  return aheadNegative.test(id);
}
isFemale(Emma); // true
isFemale(Allen); // false
isFemale(Joe); // false
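One caveat with the pattern above is that it is not anchored, so any uppercase letter that is not followed by "1 plus eight digits" passes, even in a malformed string. The sketch below contrasts it with a stricter variant. The name `strict` and the assumed ID format (one uppercase letter, then 1 or 2, then eight digits) are inferred from the examples in this post and are not an official specification.

```javascript
// The pattern from the example above (`[1]{1}` written simply as `1`).
const loose = /[A-Z](?!1[0-9]{8})/;

// Hypothetical stricter variant: anchored at both ends, and the character
// after the letter must be 1 or 2 but, per the negative lookahead, not 1.
const strict = /^[A-Z](?!1)[12][0-9]{8}$/;

const looseOnShortInput = loose.test('E2');       // true: nothing forces ten characters
const strictOnShortInput = strict.test('E2');     // false
const strictOnEmma = strict.test('E221930121');   // true
const strictOnAllen = strict.test('E129377814');  // false

console.log({ looseOnShortInput, strictOnShortInput, strictOnEmma, strictOnAllen });
```

Here `(?!1)[12]` is equivalent to a plain `2`; it is written with a lookahead only to mirror the "must not be 1" wording of the original check.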
negative + lookbehind
const Emma = 'E221930121';
const Allen = 'E129377814';
const Joe = 'A121916907';
const isNotFromKaohsiung = (id) => {
  // Nine digits starting with 1 or 2 that are NOT preceded by "E"
  const behindNegative = /(?<!E)[1-2][0-9]{8}/;
  return behindNegative.test(id);
}
isNotFromKaohsiung(Emma); // false
isNotFromKaohsiung(Allen); // false
isNotFromKaohsiung(Joe); // true
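Advanced
(Updated 2023/11/3)
Problem & information
Below are several valid URLs. From each one I want to capture everything up to, at most, the create or edit part (action stands for either create or edit).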
- /xxx
- /xxx/aaa
- /xxx/action
- /xxx/aaa/action
- /yyy-yyy/bbb/action
- /xxx/action/iii123
- /xxx/action/123iii
- /xxx/bbb/action/iii123
- /zzz/ccc/action/iii123/iii123
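The URLs above follow a few rules:
- Rule 1: the smallest unit that identifies a page category is one / followed by xxx, yyy, yyy-yyy, zzz, aaa, bbb or ccc
  - 1-1: it always comes first
  - 1-2: it appears one or more times
- Rule 2: the action is only /create or /edit
  - 2-1: it always comes right after the page categories
  - 2-2: it appears zero or one time
- Rule 3: in /id, the id is a mix of letters (upper or lower case) and digits
  - 3-1: it always comes last
  - 3-2: it appears zero or more times
- Rule 4: an id is always preceded by an action, but an action is not necessarily followed by an id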
Solution
The reasoning can be broken into a few steps:
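- From rule 1, a page category is /(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc); adding 1-2 gives (/(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+. yyy-yyy has to be listed before yyy, otherwise yyy matches first and the rest of yyy-yyy is never tried.
- From rule 2, an action is /(create|edit); adding 2-2 gives (/(create|edit))?
- From rule 3, an id is /[a-zA-Z0-9]+; adding 3-2 gives (/[a-zA-Z0-9]+)*
const pageType = `(/(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+`;
const action = `(/(create|edit))?`;
const id = `(/[a-zA-Z0-9]+)*`;
- From rules 4, 2-1 and 2-2, the relationship between page category and action is "a page category is followed by an action, or by nothing". The action has to be part of the captured text, so it is consumed as an optional group instead of being placed inside a lookahead: (/(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+(/(create|edit))?
- From rule 4, the relationship between action and id is "an action is followed by ids, or by nothing". The ids must not be captured, so they go inside a lookahead: (/(create|edit))?(?=(/[a-zA-Z0-9]+)*). The ids stay out of the match because a lookahead never consumes them; since * also accepts zero ids, this lookahead states the expectation but never rejects a path.
- Combining the two previous steps gives (/(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+(/(create|edit))?(?=(/[a-zA-Z0-9]+)*)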
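Two details in these steps are easy to get wrong: whether the action sits inside or outside the lookahead, and the order of yyy-yyy and yyy in the alternation. The sketch below checks both on sample paths. The variable names `actionLookedAhead`, `actionConsumed` and `wrongOrder` are made up for this illustration.

```javascript
const pageType = '(/(xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+';
const action = '(/(create|edit))?';
const id = '(/[a-zA-Z0-9]+)*';

// Action inside a lookahead: it is checked but not consumed,
// so the match stops after the page categories.
const actionLookedAhead = new RegExp(`${pageType}(?=${action})`);
// Action consumed as an optional group, ids inside the lookahead:
// the match extends through /create or /edit and stops before the ids.
const actionConsumed = new RegExp(`${pageType}${action}(?=${id})`);

const path = '/xxx/bbb/edit/iii123';
const withoutAction = path.match(actionLookedAhead)[0]; // '/xxx/bbb'
const withAction = path.match(actionConsumed)[0];       // '/xxx/bbb/edit'

// Alternation order: with yyy listed before yyy-yyy, "/yyy" matches first and,
// because nothing later fails, the engine never goes back to try yyy-yyy.
const wrongOrder = new RegExp('(/(xxx|yyy|yyy-yyy|zzz|aaa|bbb|ccc))+');
const truncated = '/yyy-yyy/bbb'.match(wrongOrder)[0];          // '/yyy'
const complete = '/yyy-yyy/bbb'.match(new RegExp(pageType))[0]; // '/yyy-yyy/bbb'

console.log({ withoutAction, withAction, truncated, complete });
```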
// Keep all route names in one place
enum Routes {
  XXX = 'xxx',
  YYY = 'yyy',
  YYY_YYY = 'yyy-yyy',
  ZZZ = 'zzz',
  AAA = 'aaa',
  BBB = 'bbb',
  CCC = 'ccc'
}
const paths = [
  '/xxx',
  '/xxx/aaa',
  '/xxx/create',
  '/xxx/aaa/create',
  '/yyy-yyy/bbb/create',
  '/xxx/create/iii123',
  '/xxx/edit/123iii',
  '/xxx/bbb/edit/iii123',
  '/zzz/ccc/create/iii123/iii123',
];
// YYY_YYY must come before YYY, otherwise YYY matches first and the rest of "yyy-yyy" is never tried
const pageType = `(/(${Routes.XXX}|${Routes.YYY_YYY}|${Routes.YYY}|${Routes.ZZZ}|${Routes.AAA}|${Routes.BBB}|${Routes.CCC}))+`;
const action = `/(create|edit)`;
const id = `(/[a-zA-Z0-9]+)*`;
// Build the regex from three pieces so the complex expression is easier to read
const entireRegex = new RegExp(`${pageType}(${action})?(?=${id})`);
console.log(paths.map((p) => p.match(entireRegex)?.[0]));
// ["/xxx", "/xxx/aaa", "/xxx/create", "/xxx/aaa/create", "/yyy-yyy/bbb/create", "/xxx/create", "/xxx/edit", "/xxx/bbb/edit", "/zzz/ccc/create"]
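The regex above is not anchored, and a lookahead whose content may be empty always succeeds, so it also returns a result for paths that break the rules (for example an id with no action in front of it). If rejecting such paths matters, one possible variation is sketched below. The helper name `trimToAction` and the regex `strictRegex` are hypothetical and not part of the original solution; the idea is to anchor the start with ^ and put $ inside the lookahead so that it actually constrains what may follow.

```javascript
// Same three pieces as above, written with non-capturing groups.
const pageType = '(?:/(?:xxx|yyy-yyy|yyy|zzz|aaa|bbb|ccc))+';
const action = '/(?:create|edit)';
const ids = '(?:/[a-zA-Z0-9]+)*';

// After the page categories comes either
//   an action whose remainder consists of ids only (lookahead ending in $), or
//   the end of the string.
// An id without an action in front of it therefore fails (rule 4).
const strictRegex = new RegExp(`^${pageType}(?:${action}(?=${ids}$)|$)`);

// Hypothetical helper: the path trimmed to its action, or null if the path is invalid.
function trimToAction(path) {
  const match = path.match(strictRegex);
  return match ? match[0] : null;
}

console.log(trimToAction('/zzz/ccc/create/iii123/iii123')); // '/zzz/ccc/create'
console.log(trimToAction('/xxx/aaa'));                      // '/xxx/aaa'
console.log(trimToAction('/xxx/iii123'));                   // null: id without an action
console.log(trimToAction('/unknown/xxx/create'));           // null: does not start with a page category
```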