|
118 | 118 | "output_type": "stream",
|
119 | 119 | "text": [
|
120 | 120 | "Cluster 0:\n",
|
121 |
| - "[('Heading_accuracy > 58.5 and Free_kick_accuracy > 56.0 and Agility <= 81.5', (0.93548387096774188, 0.8529411764705882, 10))]\n", |
| 121 | + "[('Agility <= 81.5 and Free_kick_accuracy > 56.0 and Heading_accuracy > 58.5', (0.93548387096774188, 0.8529411764705882, 10))]\n", |
122 | 122 | "Cluster 1:\n",
|
123 |
| - "[('Agility > 81.5 and Aggression <= 76.5 and Heading_accuracy <= 84.0', (1.0, 0.77419354838709675, 2))]\n", |
| 123 | + "[('Aggression <= 76.5 and Agility > 81.5 and Balance > 66.5', (1.0, 0.77419354838709675, 8))]\n", |
124 | 124 | "Cluster 2:\n",
|
125 |
| - "[('Heading_accuracy > 82.5 and Curve <= 61.5', (1.0, 0.7857142857142857, 8)), ('Heading_accuracy > 82.5 and Free_kick_accuracy <= 56.5', (1.0, 0.7857142857142857, 2))]\n", |
| 125 | + "[('Curve <= 61.5 and Heading_accuracy > 82.5', (1.0, 0.7857142857142857, 8))]\n", |
126 | 126 | "Cluster 3:\n",
|
127 |
| - "[('GK_reflexes > 59.0', (1.0, 1.0, 2))]\n" |
| 127 | + "[('Curve <= 28.0', (1.0, 1.0, 4))]\n" |
128 | 128 | ]
|
129 | 129 | }
|
130 | 130 | ],
|
|
139 | 139 | " X_train = data.drop(['Name', 'Preferred_Positions', 'cluster'], axis=1)\n",
|
140 | 140 | " y_train = (data['cluster']==i_cluster)*1\n",
|
141 | 141 | " skope_rules_clf = SkopeRules(feature_names=feature_names, random_state=42, n_estimators=5,\n",
|
142 |
| - " recall_min=0.5, precision_min=0.5,\n", |
| 142 | + " recall_min=0.5, precision_min=0.5, max_depth_duplication=0,\n", |
143 | 143 | " max_samples=1., max_depth=3)\n",
|
144 | 144 | " skope_rules_clf.fit(X_train, y_train)\n",
|
145 | 145 | " print('Cluster '+str(i_cluster)+':')\n",
|
|
153 | 153 | "source": [
|
154 | 154 | "In cluster 0, we find players with **good heading and free kick accuracy**, but who are **not the most agile** (<= 81/100). This rule is a good description of cluster 0: it captures 85% of cluster 0, with a precision of 93% (7% of players described by the rule are not in cluster 0). The third element of the performance tuple (10) is the number of times this rule was extracted from the trees built during skope-rules' fitting.\n",
|
155 | 155 | "\n",
|
156 |
| - "In cluster 1, we find **very agile players** which are **not** the **most aggressive** and **accurate with their heads**. This rule is very precise but misses 23% of this cluster.\n", |
| 156 | + "In cluster 1, we find **very agile players** who are **not** the **most aggressive** but have **good balance**. This rule is very precise but misses 23% of this cluster.\n", |
157 | 157 | "\n",
|
158 |
| - "In cluster 2, we find players **accurate with their heads** but with **less skills for dribbling** (Curve). Note that an alternative rule is proposed here, with similar performances.\n", |
| 158 | + "In cluster 2, we find players **accurate with their heads** but with **weaker dribbling skills** (Curve).\n", |
159 | 159 | "\n",
|
160 |
| - "In cluster 3, we find players who have **good goal-keeper reflexes**. This rule perfectly defines the cluster (100% precision, 100% recall). This is the goal-keeper cluster." |
| 160 | + "In cluster 3, we find players who are very bad at **dribbling** (Curve <= 28). This rule perfectly defines the cluster (100% precision, 100% recall). This is the goal-keeper cluster." |
161 | 161 | ]
|
162 | 162 | },
|
163 | 163 | {
|
|
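The first two elements of each performance tuple in the output above are described in the notebook text as precision and recall. As a rough illustration of how such figures could be computed for a single rule like `Curve <= 28.0`, here is a minimal sketch in plain Python. The toy `players` list and the `rule_performance` helper are hypothetical and are not part of the skope-rules API; the sketch assumes precision is the share of rule-matched players that belong to the cluster, and recall is the share of the cluster that the rule matches.

```python
# Hypothetical toy data: (Curve rating, cluster id) for six players.
# These values are made up for illustration, not taken from the notebook's dataset.
players = [(12, 3), (20, 3), (25, 3), (45, 0), (60, 1), (70, 2)]


def rule_performance(rows, predicate, target_cluster):
    """Return (precision, recall) of a boolean rule for one cluster."""
    # Cluster ids of the players selected by the rule.
    selected = [cluster for value, cluster in rows if predicate(value)]
    true_positives = sum(1 for cluster in selected if cluster == target_cluster)
    cluster_size = sum(1 for _, cluster in rows if cluster == target_cluster)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / cluster_size if cluster_size else 0.0
    return precision, recall


# A rule that isolates the toy "goal-keeper" cluster exactly.
precision, recall = rule_performance(players, lambda curve: curve <= 28.0, 3)
print(precision, recall)

# A looser threshold also selects one player from another cluster.
loose_precision, loose_recall = rule_performance(players, lambda curve: curve <= 50.0, 3)
print(loose_precision, loose_recall)
```

On this toy data the tight rule selects only cluster-3 players, while the looser threshold keeps full recall but lowers precision, which mirrors the trade-off discussed for clusters 0 to 2.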