Contributor

@BlazZupan BlazZupan commented May 23, 2025

Issue

Silhouette Plot displays the silhouette score for each group, but not the average silhouette score, which can serve as a reference for judging whether a group is above or below the average.

image

The average silhouette score is also useful for judging the quality of a clustering that precedes the Silhouette Plot widget, for example when Hierarchical Clustering or DBSCAN is followed by Silhouette Plot.
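To make the comparison concrete, here is a minimal sketch of how per-group silhouettes relate to the overall average. It is not the widget's code (which is not shown in this thread): the Euclidean distance, the toy points, and the helper name `silhouette_samples` are all assumptions made for illustration, and at least two groups are assumed.

```python
from math import dist
from statistics import mean


def silhouette_samples(points, labels):
    """Per-instance silhouette s = (b - a) / max(a, b), Euclidean distance.

    Illustrative helper (an assumption, not the widget's implementation).
    Requires at least two distinct groups.
    """
    groups = sorted(set(labels))
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean distance from p to the members of each group, excluding p itself.
        mean_dist = {}
        for g in groups:
            ds = [dist(p, q)
                  for j, (q, lab) in enumerate(zip(points, labels))
                  if lab == g and j != i]
            mean_dist[g] = mean(ds) if ds else None
        a = mean_dist[own]
        if a is None:
            # Singleton group: the silhouette is conventionally 0.
            scores.append(0.0)
            continue
        # b is the mean distance to the nearest other group.
        b = min(d for g, d in mean_dist.items() if g != own and d is not None)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


# Toy data, invented for this example.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (2.5, 2.5)]
labels = ["C1", "C1", "C1", "C2", "C2", "C2"]

scores = silhouette_samples(points, labels)
average = mean(scores)  # the reference value proposed in this PR
per_group = {g: mean(s for s, lab in zip(scores, labels) if lab == g)
             for g in sorted(set(labels))}

for group, score in per_group.items():
    relation = "above" if score > average else "at or below"
    print(f"{group}: {score:.4f} ({relation} the average {average:.4f})")
```

With the average available, each group's score can be read against a single reference value instead of only against the other groups.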

Description of changes

This pull request adds an info box that reports the average silhouette score:

image

The style of reporting (the label in bold and the number in normal text) follows that of an Info widget. When input data is missing, the box displays N/A.
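As a rough sketch of the behaviour described above (a formatted number when scores exist, N/A otherwise), the text could be produced along these lines. The function name `average_silhouette_text` is hypothetical, the label wording follows the strings that appear later in this thread, and skipping NaN values is an assumption rather than something the PR states.

```python
from math import isnan
from statistics import mean


def average_silhouette_text(scores):
    """Return info-box text for per-instance silhouette scores.

    Hypothetical helper for illustration; `scores` may be None when the
    widget has no input data.
    """
    # Assumption: NaN scores are ignored rather than propagated.
    valid = [s for s in scores if not isnan(s)] if scores is not None else []
    if not valid:
        return "Average Silhouette: N/A"
    return f"Average Silhouette: {mean(valid):.4f}"


print(average_silhouette_text(None))
print(average_silhouette_text([0.5, 0.25]))
```

Keeping the formatting in one small function makes both branches easy to cover in a unit test without constructing the widget.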

Unit tests should probably be added. So far I have tested the widget by connecting it in a workflow with k-Means, which also reports the average silhouette:

image

Includes
  • Code changes
  • Tests
  • Documentation

@BlazZupan BlazZupan changed the title from "OWSilhouettePlot: displays average silhouette" to "[Enh] OWSilhouettePlot: displays average silhouette" May 23, 2025
@BlazZupan BlazZupan changed the title from "[Enh] OWSilhouettePlot: displays average silhouette" to "[ENH] OWSilhouettePlot: displays average silhouette" May 23, 2025
codecov bot commented May 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.80%. Comparing base (872e4ae) to head (f18a786).
⚠️ Report is 49 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7092   +/-   ##
=======================================
  Coverage   88.80%   88.80%           
=======================================
  Files         335      335           
  Lines       73916    73923    +7     
=======================================
+ Hits        65638    65647    +9     
+ Misses       8278     8276    -2     

Contributor

janezd commented May 23, 2025

Hm, what about putting this on the graph, say at the top left, in two lines?

One reason is that the silhouette scores for groups are already there. The second is that if the average is part of the figure, it is saved together with the figure.

    self.avg_silhouette_label.setText(
        f"<b>Silhouette:</b> {avg_score:.4f}")
else:
    self.avg_silhouette_label.setText("<b>Silhouette:</b> N/A")
Contributor

Must it be bold? I don't think we use bold anywhere else (or at least not often).

Contributor Author

We use bold in the Data Info widget, and I copied the style from there. But I agree it sticks out, and I will correct it. Let me work on this in the coming days.

Contributor Author

bold removed

self.Warning.nan_distances(
    count_nandist, s="s" if count_nandist > 1 else "")

self._update_avg_silhouette()  # Update the average silhouette display
Contributor

Is this comment necessary?

I know: Copilot and similar tools add heaps and heaps of comments to explain the code, but they often just state the obvious. PEP 8 says such comments are distracting.

Contributor Author

removed

@ajdapretnar
Contributor

I was working on this in parallel with Blaž. Here is my proposed solution: #7106

@BlazZupan BlazZupan force-pushed the display-average-silhouette branch from 52453dc to 48a2bd2 October 10, 2025 09:03
@BlazZupan
Contributor Author

The score is now displayed in the Info box at the top of the control area, using the same style as in the Data Table.

image

data = Table("brown-selected")
self.send_signal(self.widget.Inputs.data, data)
self.assertEqual(self.widget.avg_silhouette_label.text(),
                 "Average Silhouette: 0.4692")
Contributor

Let us hope this is stable...
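If the exact string does turn out to be fragile, one possible way to loosen the check is to parse the number out of the label and compare it with a tolerance. This is only a sketch, not what the PR does: `parse_average` is a hypothetical helper, and a literal string stands in for `self.widget.avg_silhouette_label.text()` so that the example runs without the widget.

```python
import re
import unittest


def parse_average(text):
    """Extract the number from an 'Average Silhouette: <number>' label.

    Hypothetical helper; returns None when the label carries no number
    (for example 'Average Silhouette: N/A').
    """
    match = re.fullmatch(r"Average Silhouette: (-?\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


class TestAverageLabel(unittest.TestCase):
    def test_label_within_tolerance(self):
        # In the widget test this text would come from
        # self.widget.avg_silhouette_label.text().
        text = "Average Silhouette: 0.4692"
        # Tolerates differences that vanish when rounded to 3 decimals.
        self.assertAlmostEqual(parse_average(text), 0.4692, places=3)

    def test_missing_data(self):
        self.assertIsNone(parse_average("Average Silhouette: N/A"))
```

The trade-off is that a tolerance hides small numerical drift, which may or may not be desirable for a value shown to four decimals.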

Contributor

janezd commented Oct 12, 2025

@BlazZupan, for some reason GitHub doesn't allow me to push new commits to your branch. Could you please update the translation file? Here are the changes (be careful with indentation).

             Ignoring categorical features: Ignoriram kategorične spremenljivke
         def `__init__`:
+            Info: Podatki
             Distance: Razdalja
             distance_idx: false
             Grouping: Skupine
@@ -15095,6 +15096,9 @@ widgets/visualize/owsilhouetteplot.py:
             Input matrix does not have associated data: Matrika razdalj ne vsebuje podatkov.
         def `_ensure_matrix`:
             invalid state: false
+        def `_update_avg_silhouette`:
+            'Average Silhouette: {avg_score:.4f}': 'Povprečna silhueta: {avg_score:.4f}'
+            'Average Silhouette: N/A': 'Povprečna silhueta:'
         def `_update`:

@janezd janezd self-assigned this Oct 24, 2025
@janezd janezd closed this Oct 26, 2025
@janezd janezd deleted the display-average-silhouette branch October 26, 2025 11:53
4 participants