Commit 0a47c2e
Patch non-repeated sampling "inv_pop_f" (#300)
Use non-repeated sampling (sampling without replacement) instead of the
original repeated sampling when using `"candi_sel_prob": "inv_pop_f"`.
`random.choices` -> `numpy.random.choice(replace=False)`
The original behavior could pick a large portion of repeated long-tail,
low-frequency samples (the longer the tail, the worse the case), causing
tens of percent of the downstream fp calculations to be repeated and
amplifying the noise in labels from these high-force configurations.
The non-repeated sampling re-normalizes the probabilities after screening
out each picked sample.
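
The difference between the two sampling modes can be sketched with the standard library alone. This is a minimal illustration, not the dpgen2 code: the helper names (`inverse_frequency_weights`, `sample_repeated`, `sample_non_repeated`) and the toy `labels` data are hypothetical, and weighting each item by the inverse of its group size is only an assumed stand-in for what `inv_pop_f` computes. The commit itself calls `numpy.random.choice(replace=False)`; the loop below only spells out the "remove the picked sample, then re-normalize" idea described above.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each item by 1 / (size of the group it belongs to). Assumed scheme."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_repeated(weights, k, rng):
    """Weighted sampling WITH replacement: the same index may be picked again."""
    return rng.choices(range(len(weights)), weights=weights, k=k)


def sample_non_repeated(weights, k, rng):
    """Weighted sampling WITHOUT replacement.

    Each picked index is removed from the pool, so the next draw uses the
    remaining weights, which random.choices normalizes internally.
    """
    if k > len(weights):
        raise ValueError("cannot pick more unique items than are available")
    remaining = list(range(len(weights)))
    picked = []
    for _ in range(k):
        w = [weights[i] for i in remaining]
        pos = rng.choices(range(len(remaining)), weights=w, k=1)[0]
        picked.append(remaining.pop(pos))
    return picked


rng = random.Random(0)
# Toy data: 90 items in a common group, 10 items in a rare (long-tail) group.
labels = ["common"] * 90 + ["rare"] * 10
weights = inverse_frequency_weights(labels)

repeated = sample_repeated(weights, 50, rng)
unique = sample_non_repeated(weights, 50, rng)
print("distinct picks, with replacement:   ", len(set(repeated)))
print("distinct picks, without replacement:", len(set(unique)))
```

In this toy setup the 10 rare items carry half of the total weight, so sampling with replacement tends to hit them repeatedly, while the non-repeated variant always returns 50 distinct indices.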
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Improved candidate selection process to ensure unique selections when limiting the number of candidates, preventing duplicates in the output.
* **Tests**
  * Updated tests to reflect changes in the candidate selection method and to ensure correct probability handling and uniqueness of selected candidates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

1 parent 8aaa7ca commit 0a47c2e
File tree
2 files changed, +20 -16 lines changed
- dpgen2/exploration/report
- tests/exploration