*******************
Text_LanguageDetect
*******************
PHP library to identify human languages from text samples.
Returns a confidence score for each candidate language.
Installation
============
PEAR
----
::

    $ pear install Text_LanguageDetect
Composer
--------
::

    $ composer require pear/text_languagedetect
Usage
=====
Also see the examples in the ``docs/`` directory and
the `official documentation`__.
__ http://pear.php.net/package/Text_LanguageDetect/docs
Language detection
------------------
Simple language detection::

    <?php
    require_once 'Text/LanguageDetect.php';

    $text = 'Was wäre, wenn ich Ihnen das jetzt sagen würde?';

    $ld = new Text_LanguageDetect();
    $language = $ld->detectSimple($text);

    echo $language;
    //output: german
Show the three most probable languages with their confidence score::

    <?php
    require_once 'Text/LanguageDetect.php';

    $text = 'Was wäre, wenn ich Ihnen das jetzt sagen würde?';

    $ld = new Text_LanguageDetect();
    //3 most probable languages
    $results = $ld->detect($text, 3);

    foreach ($results as $language => $confidence) {
        echo $language . ': ' . number_format($confidence, 2) . "\n";
    }

    //output:
    //german: 0.35
    //dutch: 0.25
    //swedish: 0.20
    ?>
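The scores above are often close together, so it can help to accept the top result only when it clearly stands out. The following is a minimal sketch of that idea, not part of the library: ``pickConfidentLanguage()`` is a hypothetical helper, and the thresholds ``0.3`` and ``0.05`` are arbitrary assumptions. It only relies on the ``language => confidence`` array shape shown in the example above.

```php
<?php
/**
 * Hypothetical helper (not part of Text_LanguageDetect).
 *
 * Returns the best language from a "language => confidence" array,
 * or null if the best score is too low or too close to the runner-up.
 * Both thresholds are assumptions chosen for illustration.
 */
function pickConfidentLanguage(array $results, $minScore = 0.3, $minLead = 0.05)
{
    if (count($results) === 0) {
        return null;
    }

    // Sort by confidence, highest first, keeping the language keys
    arsort($results);
    $languages = array_keys($results);
    $scores    = array_values($results);

    // Reject a weak best match
    if ($scores[0] < $minScore) {
        return null;
    }

    // Reject a best match that barely beats the second candidate
    if (count($scores) > 1 && ($scores[0] - $scores[1]) < $minLead) {
        return null;
    }

    return $languages[0];
}

// Sample scores copied from the example output above
$results = array('german' => 0.35, 'dutch' => 0.25, 'swedish' => 0.20);
var_dump(pickConfidentLanguage($results));
```

In real use, the array returned by ``$ld->detect($text, 3)`` would be passed in instead of the hard-coded sample. Suitable thresholds depend on text length and on which languages are involved, so they should be tuned against your own data.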
Language code
-------------
Instead of returning the full language name, ISO 639-1 two-letter and
ISO 639-2 three-letter codes can be returned::

    <?php
    require_once 'Text/LanguageDetect.php';

    $ld = new Text_LanguageDetect();

    //will output the ISO 639-1 two-letter language code
    // "de"
    $ld->setNameMode(2);
    echo $ld->detectSimple('Das ist ein kleiner Text') . "\n";

    //will output the ISO 639-2 three-letter language code
    // "deu"
    $ld->setNameMode(3);
    echo $ld->detectSimple('Das ist ein kleiner Text') . "\n";
    ?>
Supported languages
===================
- albanian
- arabic
- azeri
- bengali
- bulgarian
- cebuano
- croatian
- czech
- danish
- dutch
- english
- estonian
- farsi
- finnish
- french
- german
- hausa
- hawaiian
- hindi
- hungarian
- icelandic
- indonesian
- italian
- kazakh
- kyrgyz
- latin
- latvian
- lithuanian
- macedonian
- mongolian
- nepali
- norwegian
- pashto
- pidgin
- polish
- portuguese
- romanian
- russian
- serbian
- slovak
- slovene
- somali
- spanish
- swahili
- swedish
- tagalog
- turkish
- ukrainian
- urdu
- uzbek
- vietnamese
- welsh
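The list above may not match the language database of the version you have installed. The following short sketch checks the available languages at runtime. It assumes the library is installed as described under Installation and that ``getLanguages()`` returns the language names in the default name mode.

```php
<?php
require_once 'Text/LanguageDetect.php';

$ld = new Text_LanguageDetect();

// Names of all languages in the loaded language database
$languages = $ld->getLanguages();
sort($languages);

echo count($languages) . " languages available\n";

// 'german' is taken from the list above
if (in_array('german', $languages, true)) {
    echo "german is supported\n";
}
```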
Links
=====
Homepage
  http://pear.php.net/package/Text_LanguageDetect
Bug tracker
  http://pear.php.net/bugs/search.php?cmd=display&package_name[]=Text_LanguageDetect
Documentation
  http://pear.php.net/package/Text_LanguageDetect/docs
Unit test status
  https://travis-ci.org/pear/Text_LanguageDetect

  .. image:: https://travis-ci.org/pear/Text_LanguageDetect.svg?branch=master
     :target: https://travis-ci.org/pear/Text_LanguageDetect