alsotang

A fullstack JS programmer.

Member Since 10 years ago

Tencent, ShenZhen, China

5.1k followers
76 following
1.2k stars
192 repos

345 contributions in the last year

Pinned
⚡ :baby_chick:Nodeclub is a community system built with Node.js and MongoDB
⚡ :closed_book:"Node.js 包教不包会" (Node.js lessons: teaching guaranteed, learning not) by alsotang
⚡ :blue_book:Demos of the various async.js functions
⚡ :heart_eyes: Writing Fast JavaScript
⚡ Determine whether a string is entirely Chinese (based on Unicode ranges)
⚡ Generate arbitrary size file on Cloudflare Workers
Activity
Aug 3 (2 months ago)

alsotang issue comment alsotang/is-chinese

reduce range of unicode

before:

isChinese("扁担宽,板凳长,扁担想绑在板凳上。") x 11,562,495 ops/sec ±0.56% (88 runs sampled)
isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。") x 41,004,072 ops/sec ±2.48% (87 runs sampled)
isChinese("扁担宽,板凳长,扁担想绑在板凳上。ss") x 29,389,279 ops/sec ±2.31% (88 runs sampled)
isChinese(chars1000) true x 332,923 ops/sec ±1.06% (89 runs sampled)
isChinese(chars1000WithS) false x 1,690,598 ops/sec ±1.35% (86 runs sampled)
Fastest is isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。")

after:

isChinese("扁担宽,板凳长,扁担想绑在板凳上。") x 11,947,991 ops/sec ±0.45% (90 runs sampled)
isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。") x 42,548,217 ops/sec ±1.18% (89 runs sampled)
isChinese("扁担宽,板凳长,扁担想绑在板凳上。ss") x 30,718,944 ops/sec ±0.48% (90 runs sampled)
isChinese(chars1000) true x 375,754 ops/sec ±0.47% (90 runs sampled)
isChinese(chars1000WithS) false x 1,746,008 ops/sec ±0.89% (88 runs sampled)
Fastest is isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。")

alsotang commented:

Maybe we should just trust the regex engine? I don't think these changes are necessary.

Aug 2 (2 months ago)

alsotang issue comment alsotang/is-chinese

reduce range of unicode

alsotang commented:

These ranges are compiled into a regex. Maybe the regex engine already merges the ranges internally, so we don't need to do it by hand?
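
To make the question concrete, here is a minimal sketch of how a table of code-point ranges can be turned into one regex character class. The names `ranges`, `toClassFragment`, `buildRegex`, and `isAllChinese` are hypothetical and not taken from is-chinese, and the two ranges are only a small sample of what the real table would contain.

```javascript
// Hypothetical sample table; the real library lists many more ranges.
const ranges = [
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xf900, 0xfaff], // CJK Compatibility Ideographs
];

// Render one range as a \u{...}-\u{...} fragment for a character class.
function toClassFragment([start, end]) {
  const hex = (n) => `\\u{${n.toString(16)}}`;
  return start === end ? hex(start) : `${hex(start)}-${hex(end)}`;
}

// Build ^[...]+$ once; the "u" flag enables \u{...} escapes.
function buildRegex(table) {
  return new RegExp(`^[${table.map(toClassFragment).join('')}]+$`, 'u');
}

const allChineseRe = buildRegex(ranges);

function isAllChinese(str) {
  return allChineseRe.test(str);
}

console.log(isAllChinese('扁担宽'));   // true
console.log(isAllChinese('ss扁担宽')); // false
```

Because the class is built once and handed to the engine as a whole, any merging of overlapping or adjacent ranges could in principle happen inside the engine; whether a given engine does so is something only a benchmark can confirm.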

alsotang issue comment alsotang/is-chinese

reduce range of unicode

alsotang commented:

I checked out this commit and compared its benchmark results with master:

$ node -v
v16.4.0
$ node benchmark.js
isChinese("扁担宽,板凳长,扁担想绑在板凳上。") x 10,532,167 ops/sec ±0.84% (87 runs sampled)
isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。") x 41,371,203 ops/sec ±0.74% (91 runs sampled)
isChinese("扁担宽,板凳长,扁担想绑在板凳上。ss") x 28,886,915 ops/sec ±0.98% (88 runs sampled)
isChinese(chars1000) true x 317,638 ops/sec ±0.89% (87 runs sampled)
isChinese(chars1000WithS) false x 1,611,493 ops/sec ±0.99% (88 runs sampled)
isChinese("扁担宽,板凳长,扁担想绑在板凳上。") x 10,549,005 ops/sec ±1.08% (88 runs sampled)
isChinese("ss扁担宽,板凳长,扁担想绑在板凳上。") x 38,652,878 ops/sec ±5.66% (84 runs sampled)
isChinese("扁担宽,板凳长,扁担想绑在板凳上。ss") x 28,347,738 ops/sec ±2.94% (87 runs sampled)
isChinese(chars1000) true x 299,741 ops/sec ±1.77% (83 runs sampled)
isChinese(chars1000WithS) false x 1,571,945 ops/sec ±1.43% (83 runs sampled)

There is no improvement.

alsotang issue comment alsotang/is-chinese

reduce range of unicode

alsotang commented:

This change is good. The overhead is incurred only once, when the library initializes, so it's acceptable. Just make sure the merged ranges are correct and I will merge this.
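
One way to check that a merged table is correct is to merge the ranges programmatically and compare membership against the original table. The sketch below is only an illustration: `mergeRanges`, `inRanges`, and the sample data are assumptions, not the code from the pull request.

```javascript
// Merge overlapping or adjacent [start, end] ranges (inclusive bounds).
function mergeRanges(ranges) {
  const sorted = ranges.map((r) => [...r]).sort((a, b) => a[0] - b[0]);
  const merged = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (last && start <= last[1] + 1) {
      last[1] = Math.max(last[1], end); // extend the previous range
    } else {
      merged.push([start, end]);
    }
  }
  return merged;
}

// Linear membership test, used only for verification.
function inRanges(ranges, codePoint) {
  return ranges.some(([start, end]) => codePoint >= start && codePoint <= end);
}

// Hypothetical sample input containing adjacent ranges and a duplicate.
const original = [
  [0x3400, 0x4dbf],
  [0x4dc0, 0x4dff],
  [0x4e00, 0x9fff],
  [0xf900, 0xfaff],
  [0xf900, 0xfaff],
];
const merged = mergeRanges(original);

// Exhaustively compare both tables over the Basic Multilingual Plane.
let mismatches = 0;
for (let cp = 0; cp <= 0xffff; cp++) {
  if (inRanges(original, cp) !== inRanges(merged, cp)) mismatches++;
}

console.log(merged.length, mismatches); // 2 0
```

Since the merge runs once at load time, an exhaustive comparison like this belongs in the test suite rather than in the library itself.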

alsotang issue comment alsotang/is_chinese_rs

perf:  🚀 speed up with simd

alsotang commented:

You said there are some bugs in your solution, but the current test cases don't expose them. Could you add test cases that do?

alsotang issue comment alsotang/is_chinese_rs

perf:  🚀 speed up with simd

alsotang commented:

You can create a simd branch in this repo and mention it in the README. But I won't merge it into main until SIMD is a stable feature. Is that OK?

alsotang issue comment alsotang/is_chinese_rs

perf:  🚀 speed up with simd

alsotang commented:

This optimization can't be built on stable Rust?

   Compiling packed_simd_2 v0.3.5
error[E0554]: `#![feature]` may not be used on the stable release channel
   --> /Users/atang/.cargo/registry/src/github.com-1ecc6299db9ec823/packed_simd_2-0.3.5/src/lib.rs:214:1
    |
214 | / #![feature(
215 | |     const_generics,
216 | |     repr_simd,
217 | |     rustc_attrs,
...   |
227 | |     llvm_asm
228 | | )]
    | |__^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0554`.
error: could not compile `packed_simd_2`

alsotang issue comment alsotang/is-chinese

reduce range of unicode

alsotang commented:

The data structure looks asymmetric, and this optimization is hard-coded. While the performance gain from the proxy is clear, could the implementation be a little more graceful?

alsotang issue alsotang/is_chinese_rs

duplicate range

These two entries cover the same range:

[0xf900, 0xfaff], // CJK Compatibility Ideographs
[0xf900, 0xfaff], // https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs
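
A duplicate like this can be caught mechanically. The following is a small sketch, assuming the table is an array of [start, end] pairs; `findDuplicateRanges` is a hypothetical helper, and it is written in JavaScript for illustration even though is_chinese_rs itself is a Rust crate.

```javascript
// Return each repeated occurrence of a [start, end] pair in the table.
function findDuplicateRanges(ranges) {
  const seen = new Set();
  const duplicates = [];
  for (const [start, end] of ranges) {
    const key = `${start}-${end}`;
    if (seen.has(key)) {
      duplicates.push([start, end]);
    } else {
      seen.add(key);
    }
  }
  return duplicates;
}

const table = [
  [0x4e00, 0x9fff], // hypothetical neighbouring entry
  [0xf900, 0xfaff], // CJK Compatibility Ideographs
  [0xf900, 0xfaff], // https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs
];

const dupes = findDuplicateRanges(table);
console.log(dupes.map(([s, e]) => `0x${s.toString(16)}-0x${e.toString(16)}`));
// [ '0xf900-0xfaff' ]
```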

alsotang issue comment alsotang/is_chinese_rs

duplicate range

alsotang commented:

Fixed by c51829061de2f36f9014ea0ff688afae2400bc72.

Jul 24 (3 months ago)

alsotang pull request ajmwagar/merino

clone the reference instead of the whole vector

alsotang created branch use_arc in alsotang/merino (2 months ago)

alsotang forked ajmwagar/merino (2 months ago)

⚡ :sheep: A SOCKS5 Proxy server written in Rust
MIT License