{"id":45410,"date":"2023-03-22T14:03:26","date_gmt":"2023-08-17T07:37:48","guid":{"rendered":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/"},"modified":"2024-04-29T20:03:11","modified_gmt":"2024-04-29T12:03:11","slug":"45410-2","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/","title":{"rendered":""},"content":{"rendered":"<h1>Overview<\/h1>\n<p>I wrote a program that counts the Japanese words in a text file using Lindera, a morphological analyzer written in Rust. This article explains how to use Lindera and how to speed up the processing with rayon, a parallel-processing library.<\/p>\n<h1>About Rust<\/h1>\n<p>Rust is a programming language that allows fast, low-level programming on par with C and C++ while emphasizing memory safety. It is a relatively new language, with version 1.0 released in 2015, and it has been gaining popularity in recent years.<\/p>\n<h1>Motivation for using Rust for natural language processing<\/h1>\n<p>The main motivation is simply that I am learning Rust and wanted to try using it.<br \/>\nBeyond that:<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: 
none;\">\n<ul class=\"post-ul\">Examples of Japanese natural language processing in Rust are not that common online, so I decided to try it.<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">I wanted to check whether Rust offers a speed advantage over, for example, Python when segmenting a reasonably large text file (several hundred megabytes) into words.<\/ul>\n<p>These points are also reasons I chose Rust.<\/p>\n<h1>Installing Rust and the version used<\/h1>\n<p>The officially recommended way to install the Rust compiler and cargo (Rust's build system and package manager) is to use rustup, a toolchain management tool.<\/p>\n<pre class=\"post-pre\"><code><span class=\"nv\">$ <\/span>rustup 
update\r\n<\/code><\/pre>\n<p>With a command like the one above, rustup updates Rust and its surrounding tools to the latest version. It is the first method introduced on the official site, which makes it the recommended one, and it is the installation method I have always used.<\/p>\n<p>However, when I looked into this again while writing this article, I found a note (reference site) saying that Homebrew users can install the whole toolchain with the following command:<\/p>\n<pre class=\"post-pre\"><code><span class=\"nv\">$ <\/span>brew <span class=\"nb\">install <\/span>rust\r\n<\/code><\/pre>\n<p>This may be a good option if you want to manage updates through Homebrew (I have not tried it).<\/p>\n<p>The version of Rust used for this article is as follows.<\/p>\n<pre class=\"post-pre\"><code><span class=\"nv\">$ <\/span>rustc <span class=\"nt\">-V<\/span>\r\nrustc 1.48.0 <span class=\"o\">(<\/span>7eac88abb 2020-11-16<span class=\"o\">)<\/span>\r\n<\/code><\/pre>\n<p>As stated on the Lindera site, Lindera 0.7.1 requires Rust 1.46.0
or later (as of December 21, 2020), so please use that version or newer.<\/p>\n<h1>Implementation<\/h1>\n<p>The complete set of programs written for this article is available here.<\/p>\n<h2>A sample program using Lindera<\/h2>\n<p>In Cargo.toml, declare the dependent crates:<\/p>\n<pre class=\"post-pre\"><code><span class=\"nn\">[dependencies]<\/span>\r\n<span class=\"py\">lindera-core<\/span> <span class=\"p\">=<\/span> <span class=\"s\">\"0.7.1\"<\/span>\r\n<span class=\"py\">lindera<\/span> <span class=\"p\">=<\/span> <span class=\"s\">\"0.7.1\"<\/span>\r\n<\/code><\/pre>\n<p>Then, referring to the example on the official Lindera site, write a sample program like this:<\/p>\n<pre class=\"post-pre\"><code><span class=\"k\">use<\/span> <span class=\"nn\">lindera<\/span><span class=\"p\">::<\/span><span class=\"nn\">tokenizer<\/span><span class=\"p\">::<\/span><span class=\"n\">Tokenizer<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">lindera_core<\/span><span class=\"p\">::<\/span><span class=\"nn\">core<\/span><span class=\"p\">::<\/span><span class=\"nn\">viterbi<\/span><span class=\"p\">::<\/span><span class=\"n\">Mode<\/span><span class=\"p\">;<\/span>\r\n\r\n<span class=\"k\">fn<\/span> <span class=\"nf\">main<\/span><span class=\"p\">()<\/span> <span class=\"p\">{<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"k\">mut<\/span> <span class=\"n\">tokenizer<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">Tokenizer<\/span><span class=\"p\">::<\/span><span class=\"nf\">new<\/span><span class=\"p\">(<\/span><span class=\"nn\">Mode<\/span><span class=\"p\">::<\/span><span class=\"n\">Normal<\/span><span class=\"p\">,<\/span> 
<span class=\"s\">\"\"<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">tokens<\/span> <span class=\"o\">=<\/span> <span class=\"n\">tokenizer<\/span><span class=\"nf\">.tokenize<\/span><span class=\"p\">(<\/span><span class=\"s\">\"Rust\u306f\u96e3\u3057\u3044\u304c\u3001\u9762\u767d\u3044\u3002\"<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"k\">for<\/span> <span class=\"n\">token<\/span> <span class=\"n\">in<\/span> <span class=\"n\">tokens<\/span> <span class=\"p\">{<\/span>\r\n        <span class=\"nd\">println!<\/span><span class=\"p\">(<\/span><span class=\"s\">\"{}<\/span><span class=\"se\">\\t<\/span><span class=\"s\">{:?}\"<\/span><span class=\"p\">,<\/span> <span class=\"n\">token<\/span><span class=\"py\">.text<\/span><span class=\"p\">,<\/span> <span class=\"n\">token<\/span><span class=\"py\">.detail<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"p\">}<\/span>\r\n<span class=\"p\">}<\/span>\r\n<\/code><\/pre>\n<p>Save this sample program as examples\/lindera_example.rs.<br \/>\nRunning it with cargo run --example lindera_example produces output like the following.<\/p>\n<pre class=\"post-pre\"><code>Rust    [\"UNK\"]\r\n\u306f      [\"\u52a9\u8a5e\", \"\u4fc2\u52a9\u8a5e\", \"*\", \"*\", \"*\", \"*\", \"\u306f\", \"\u30cf\", \"\u30ef\"]\r\n\u96e3\u3057\u3044  [\"\u5f62\u5bb9\u8a5e\", \"\u81ea\u7acb\", \"*\", \"*\", \"\u5f62\u5bb9\u8a5e\u30fb\u30a4\u6bb5\", \"\u57fa\u672c\u5f62\", \"\u96e3\u3057\u3044\", \"\u30e0\u30ba\u30ab\u30b7\u30a4\", \"\u30e0\u30ba\u30ab\u30b7\u30a4\"]\r\n\u304c      [\"\u52a9\u8a5e\", \"\u63a5\u7d9a\u52a9\u8a5e\", \"*\", \"*\", \"*\", \"*\", \"\u304c\", \"\u30ac\", \"\u30ac\"]\r\n\u3001      [\"\u8a18\u53f7\", \"\u8aad\u70b9\", \"*\", \"*\", \"*\", \"*\", 
\"\u3001\", \"\u3001\", \"\u3001\"]\r\n\u9762\u767d\u3044  [\"\u5f62\u5bb9\u8a5e\", \"\u81ea\u7acb\", \"*\", \"*\", \"\u5f62\u5bb9\u8a5e\u30fb\u30a2\u30a6\u30aa\u6bb5\", \"\u57fa\u672c\u5f62\", \"\u9762\u767d\u3044\", \"\u30aa\u30e2\u30b7\u30ed\u30a4\", \"\u30aa\u30e2\u30b7\u30ed\u30a4\"]\r\n\u3002      [\"\u8a18\u53f7\", \"\u53e5\u70b9\", \"*\", \"*\", \"*\", \"*\", \"\u3002\", \"\u3002\", \"\u3002\"]\r\n<\/code><\/pre>\n<p>In each Token object returned by the tokenize() method, text holds the segmented word and detail holds information such as the part of speech and the reading.<br \/>\nCounting these token objects gives the number of Japanese words.<\/p>\n<h2>Counting Japanese words<\/h2>\n<p>Next, implement the processing that counts the Japanese words contained in a text file.<\/p>\n<pre class=\"post-pre\"><code><span class=\"k\">use<\/span> <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"n\">env<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"n\">fs<\/span><span class=\"p\">;<\/span>\r\n\r\n<span class=\"k\">use<\/span> <span class=\"nn\">lindera<\/span><span class=\"p\">::<\/span><span class=\"nn\">tokenizer<\/span><span class=\"p\">::<\/span><span class=\"n\">Tokenizer<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">lindera_core<\/span><span class=\"p\">::<\/span><span 
class=\"nn\">core<\/span><span class=\"p\">::<\/span><span class=\"nn\">viterbi<\/span><span class=\"p\">::<\/span><span class=\"n\">Mode<\/span><span class=\"p\">;<\/span>\r\n\r\n<span class=\"k\">fn<\/span> <span class=\"nf\">count_words<\/span><span class=\"p\">(<\/span><span class=\"n\">tokenizer<\/span><span class=\"p\">:<\/span> <span class=\"o\">&amp;<\/span><span class=\"k\">mut<\/span> <span class=\"n\">Tokenizer<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span><span class=\"p\">:<\/span> <span class=\"o\">&amp;<\/span><span class=\"nb\">str<\/span><span class=\"p\">)<\/span> <span class=\"k\">-&gt;<\/span> <span class=\"nb\">usize<\/span> <span class=\"p\">{<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">tokens<\/span> <span class=\"o\">=<\/span> <span class=\"n\">tokenizer<\/span><span class=\"nf\">.tokenize<\/span><span class=\"p\">(<\/span><span class=\"n\">line<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"n\">tokens<\/span><span class=\"nf\">.len<\/span><span class=\"p\">()<\/span>\r\n<span class=\"p\">}<\/span>\r\n\r\n<span class=\"k\">fn<\/span> <span class=\"nf\">main<\/span><span class=\"p\">()<\/span> <span class=\"p\">{<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">args<\/span><span class=\"p\">:<\/span> <span class=\"nb\">Vec<\/span><span class=\"o\">&lt;<\/span><span class=\"nb\">String<\/span><span class=\"o\">&gt;<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">env<\/span><span class=\"p\">::<\/span><span class=\"nf\">args<\/span><span class=\"p\">()<\/span><span class=\"nf\">.collect<\/span><span class=\"p\">();<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">filename<\/span> <span class=\"o\">=<\/span> <span class=\"n\">args<\/span><span class=\"nf\">.get<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span><span class=\"nf\">.unwrap_or_else<\/span><span class=\"p\">(||<\/span> <span class=\"p\">{<\/span>\r\n        <span 
class=\"nd\">println!<\/span><span class=\"p\">(<\/span><span class=\"s\">\"Please give the input file.\"<\/span><span class=\"p\">);<\/span>\r\n        <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"nn\">process<\/span><span class=\"p\">::<\/span><span class=\"nf\">exit<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"p\">});<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"k\">mut<\/span> <span class=\"n\">tokenizer<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">Tokenizer<\/span><span class=\"p\">::<\/span><span class=\"nf\">new<\/span><span class=\"p\">(<\/span><span class=\"nn\">Mode<\/span><span class=\"p\">::<\/span><span class=\"n\">Normal<\/span><span class=\"p\">,<\/span> <span class=\"s\">\"\"<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">contents<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">fs<\/span><span class=\"p\">::<\/span><span class=\"nf\">read_to_string<\/span><span class=\"p\">(<\/span><span class=\"n\">filename<\/span><span class=\"p\">)<\/span><span class=\"nf\">.unwrap<\/span><span class=\"p\">();<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">count<\/span><span class=\"p\">:<\/span> <span class=\"nb\">usize<\/span> <span class=\"o\">=<\/span> <span class=\"n\">contents<\/span>\r\n        <span class=\"nf\">.lines<\/span><span class=\"p\">()<\/span>\r\n        <span class=\"nf\">.map<\/span><span class=\"p\">(|<\/span><span class=\"n\">line<\/span><span class=\"p\">|<\/span> <span class=\"nf\">count_words<\/span><span class=\"p\">(<\/span><span class=\"o\">&amp;<\/span><span class=\"k\">mut<\/span> <span class=\"n\">tokenizer<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span><span class=\"p\">),<\/span>\r\n        <span class=\"p\">)<\/span>\r\n        <span class=\"nf\">.sum<\/span><span class=\"p\">();<\/span>\r\n    <span 
class=\"nd\">println!<\/span><span class=\"p\">(<\/span><span class=\"s\">\" {} {}\"<\/span><span class=\"p\">,<\/span> <span class=\"n\">count<\/span><span class=\"p\">,<\/span> <span class=\"n\">filename<\/span><span class=\"p\">);<\/span>\r\n<span class=\"p\">}<\/span>\r\n<\/code><\/pre>\n<p>This program performs the following steps:<br \/>\n1. Opens the file named by the command-line argument and loads it into memory<br \/>\n2. Initializes Lindera's Tokenizer<br \/>\n3. Splits the text at line breaks and passes each line to the Tokenizer for word segmentation<br \/>\n4. Sums the segmented word counts and prints the total<\/p>\n<p>Building it with cargo build --release and running the resulting binary prints the Japanese word count, as shown here:<\/p>\n<pre class=\"post-pre\"><code><span class=\"nv\">$ <\/span>.\/target\/release\/ja-word-count \/Users\/username\/100MB.txt\r\n  26256732 \/Users\/username\/100MB.txt\r\n<\/code><\/pre>\n<p>The first field is the total word count, followed by the input file name.<\/p>\n<p>Processing a 100 MB Japanese text file took roughly 60 to 70 seconds on my machine (a MacBook Pro with a 3.5 GHz dual-core Intel Core
i7).<\/p>\n<h2>Parallelization with rayon<\/h2>\n<p>To shorten the processing time, parallelize the program above with rayon, a parallel-processing library for Rust.<\/p>\n<pre class=\"post-pre\"><code><span class=\"k\">use<\/span> <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"n\">env<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"n\">fs<\/span><span class=\"p\">;<\/span>\r\n\r\n<span class=\"k\">use<\/span> <span class=\"nn\">lindera<\/span><span class=\"p\">::<\/span><span class=\"nn\">tokenizer<\/span><span class=\"p\">::<\/span><span class=\"n\">Tokenizer<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">lindera_core<\/span><span class=\"p\">::<\/span><span class=\"nn\">core<\/span><span class=\"p\">::<\/span><span class=\"nn\">viterbi<\/span><span class=\"p\">::<\/span><span class=\"n\">Mode<\/span><span class=\"p\">;<\/span>\r\n<span class=\"k\">use<\/span> <span class=\"nn\">rayon<\/span><span class=\"p\">::<\/span><span class=\"nn\">prelude<\/span><span class=\"p\">::<\/span><span class=\"o\">*<\/span><span class=\"p\">;<\/span>\r\n\r\n<span class=\"k\">fn<\/span> <span class=\"nf\">count_words<\/span><span class=\"p\">(<\/span><span class=\"n\">tokenizer<\/span><span class=\"p\">:<\/span> <span class=\"o\">&amp;<\/span><span class=\"k\">mut<\/span> <span class=\"n\">Tokenizer<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span><span class=\"p\">:<\/span> <span class=\"o\">&amp;<\/span><span class=\"nb\">str<\/span><span class=\"p\">)<\/span> <span class=\"k\">-&gt;<\/span> <span class=\"nb\">usize<\/span> <span 
class=\"p\">{<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">tokens<\/span> <span class=\"o\">=<\/span> <span class=\"n\">tokenizer<\/span><span class=\"nf\">.tokenize<\/span><span class=\"p\">(<\/span><span class=\"n\">line<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"n\">tokens<\/span><span class=\"nf\">.len<\/span><span class=\"p\">()<\/span>\r\n<span class=\"p\">}<\/span>\r\n\r\n<span class=\"k\">fn<\/span> <span class=\"nf\">main<\/span><span class=\"p\">()<\/span> <span class=\"p\">{<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">args<\/span><span class=\"p\">:<\/span> <span class=\"nb\">Vec<\/span><span class=\"o\">&lt;<\/span><span class=\"nb\">String<\/span><span class=\"o\">&gt;<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">env<\/span><span class=\"p\">::<\/span><span class=\"nf\">args<\/span><span class=\"p\">()<\/span><span class=\"nf\">.collect<\/span><span class=\"p\">();<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">filename<\/span> <span class=\"o\">=<\/span> <span class=\"n\">args<\/span><span class=\"nf\">.get<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span><span class=\"nf\">.unwrap_or_else<\/span><span class=\"p\">(||<\/span> <span class=\"p\">{<\/span>\r\n        <span class=\"nd\">println!<\/span><span class=\"p\">(<\/span><span class=\"s\">\"Please give the input file.\"<\/span><span class=\"p\">);<\/span>\r\n        <span class=\"nn\">std<\/span><span class=\"p\">::<\/span><span class=\"nn\">process<\/span><span class=\"p\">::<\/span><span class=\"nf\">exit<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">);<\/span>\r\n    <span class=\"p\">});<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">contents<\/span> <span class=\"o\">=<\/span> <span class=\"nn\">fs<\/span><span class=\"p\">::<\/span><span class=\"nf\">read_to_string<\/span><span class=\"p\">(<\/span><span 
class=\"n\">filename<\/span><span class=\"p\">)<\/span><span class=\"nf\">.unwrap<\/span><span class=\"p\">();<\/span>\r\n    <span class=\"k\">let<\/span> <span class=\"n\">count<\/span><span class=\"p\">:<\/span> <span class=\"nb\">usize<\/span> <span class=\"o\">=<\/span> <span class=\"n\">contents<\/span>\r\n        <span class=\"nf\">.par_lines<\/span><span class=\"p\">()<\/span>\r\n        <span class=\"nf\">.map_init<\/span><span class=\"p\">(<\/span>\r\n            <span class=\"p\">||<\/span> <span class=\"nn\">Tokenizer<\/span><span class=\"p\">::<\/span><span class=\"nf\">new<\/span><span class=\"p\">(<\/span><span class=\"nn\">Mode<\/span><span class=\"p\">::<\/span><span class=\"n\">Normal<\/span><span class=\"p\">,<\/span> <span class=\"s\">\"\"<\/span><span class=\"p\">),<\/span>\r\n            <span class=\"p\">|<\/span><span class=\"n\">tokenizer<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span><span class=\"p\">|<\/span> <span class=\"nf\">count_words<\/span><span class=\"p\">(<\/span><span class=\"n\">tokenizer<\/span><span class=\"p\">,<\/span> <span class=\"n\">line<\/span><span class=\"p\">),<\/span>\r\n        <span class=\"p\">)<\/span>\r\n        <span class=\"nf\">.sum<\/span><span class=\"p\">();<\/span>\r\n    <span class=\"nd\">println!<\/span><span class=\"p\">(<\/span><span class=\"s\">\" {} {}\"<\/span><span class=\"p\">,<\/span> <span class=\"n\">count<\/span><span class=\"p\">,<\/span> <span class=\"n\">filename<\/span><span class=\"p\">);<\/span>\r\n<span class=\"p\">}<\/span>\r\n<\/code><\/pre>\n<p>Two things changed for the parallel version. First, rayon's par_lines method converts the lines of the text into a ParallelIterator
that can be processed in parallel. Second, rayon's map_init method handles the Tokenizer initialization, which needs to run only once per parallel task rather than once per line.<\/p>\n<p>Processing the same 100 MB Japanese text file took 27 seconds, roughly 2.2 to 2.6 times faster than the 60 to 70 seconds of the sequential version. That is a fairly decent gain from parallelization.<\/p>\n<h1>Comparison with Python<\/h1>\n<p>As a bonus, I compared the processing time against a similar program implemented in Python (here).<\/p>\n<p>Processing the same 100 MB text file on the same machine with that main.py
took about 40 minutes. However, this is not a fair comparison in several respects. For example, I made no effort at all to optimize the Python version, and because the morphological analyzer is different to begin with, the word counts do not match either. Please treat this as a rough reference only.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u6982\u8981 Rust\u88fd\u306e\u5f62\u614b\u7d20\u89e3\u6790\u5668Lindera\u3092\u4f7f\u7528\u3057\u3066\u3001\u30c6\u30ad\u30b9\u30c8\u30d5\u30a1\u30a4\u30eb\u306b\u542b\u307e\u308c\u308b\u65e5\u672c\u8a9e\u306e\u8a9e\u6570\u3092\u6570\u3048\u308b\u30d7\u30ed\u30b0\u30e9 [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-45410","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>- Blog - Silicon Cloud<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" 
content=\"article\" \/>\n<meta property=\"og:description\" content=\"\u6982\u8981 Rust\u88fd\u306e\u5f62\u614b\u7d20\u89e3\u6790\u5668Lindera\u3092\u4f7f\u7528\u3057\u3066\u3001\u30c6\u30ad\u30b9\u30c8\u30d5\u30a1\u30a4\u30eb\u306b\u542b\u307e\u308c\u308b\u65e5\u672c\u8a9e\u306e\u8a9e\u6570\u3092\u6570\u3048\u308b\u30d7\u30ed\u30b0\u30e9 [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-17T07:37:48+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-29T12:03:11+00:00\" \/>\n<meta name=\"author\" content=\"\u6587, \u7fd4\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"\u6587, \u7fd4\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 \u5206\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/\",\"name\":\"- Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\"},\"datePublished\":\"2023-08-17T07:37:48+00:00\",\"dateModified\":\"2024-04-29T12:03:11+00:00\",\"author\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/64d5cc7727fffbff2f9a2a8da1de3e5c\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/\"]}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/\",\"name\":\"Blog - Silicon 
Cloud\",\"description\":\"\",\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/64d5cc7727fffbff2f9a2a8da1de3e5c\",\"name\":\"\u6587, \u7fd4\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/920c3d673e0bccacc98e5e6b7149bb3c22edd8d39cb753e5d7d7e471498118a1?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/920c3d673e0bccacc98e5e6b7149bb3c22edd8d39cb753e5d7d7e471498118a1?s=96&d=mm&r=g\",\"caption\":\"\u6587, \u7fd4\"},\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/author\/wenxiang\/\"},{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/#local-main-organization-logo\",\"url\":\"\",\"contentUrl\":\"\",\"caption\":\"Blog - Silicon Cloud\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"- Blog - Silicon Cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/","og_locale":"zh_CN","og_type":"article","og_description":"\u6982\u8981 Rust\u88fd\u306e\u5f62\u614b\u7d20\u89e3\u6790\u5668Lindera\u3092\u4f7f\u7528\u3057\u3066\u3001\u30c6\u30ad\u30b9\u30c8\u30d5\u30a1\u30a4\u30eb\u306b\u542b\u307e\u308c\u308b\u65e5\u672c\u8a9e\u306e\u8a9e\u6570\u3092\u6570\u3048\u308b\u30d7\u30ed\u30b0\u30e9 [&hellip;]","og_url":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/","og_site_name":"Blog - Silicon Cloud","article_published_time":"2023-08-17T07:37:48+00:00","article_modified_time":"2024-04-29T12:03:11+00:00","author":"\u6587, \u7fd4","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"\u6587, \u7fd4","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"2 
\u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/","url":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/","name":"- Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website"},"datePublished":"2023-08-17T07:37:48+00:00","dateModified":"2024-04-29T12:03:11+00:00","author":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/64d5cc7727fffbff2f9a2a8da1de3e5c"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/"]}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website","url":"https:\/\/www.silicloud.com\/zh\/blog\/","name":"Blog - Silicon Cloud","description":"","inLanguage":"zh-Hans"},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/64d5cc7727fffbff2f9a2a8da1de3e5c","name":"\u6587, \u7fd4","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/920c3d673e0bccacc98e5e6b7149bb3c22edd8d39cb753e5d7d7e471498118a1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/920c3d673e0bccacc98e5e6b7149bb3c22edd8d39cb753e5d7d7e471498118a1?s=96&d=mm&r=g","caption":"\u6587, \u7fd4"},"url":"https:\/\/www.silicloud.com\/zh\/blog\/author\/wenxiang\/"},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/45410-2\/#local-main-organization-logo","url":"","contentUrl":"","caption":"Blog - Silicon 
Cloud"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/45410","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/comments?post=45410"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/45410\/revisions"}],"predecessor-version":[{"id":87352,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/45410\/revisions\/87352"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/media?parent=45410"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/categories?post=45410"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/tags?post=45410"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}