Conversation
|
@oxzi Would you like to do the first round of reviews here? |
oxzi
left a comment
There was a problem hiding this comment.
In general and when excluding the open TODOs: LGTM, thanks!
I have spent a bit of a time testing this PR in different situations and it worked as expected. Btw, I have pushed a rebased version to the fake-downtime-end-event-review-oxzi branch, also including an initial draft for an integration test, 0414c2a.
I would say: Let's continue with this PR by also populating the other tables.
@yhabteab: If you currently do not have the time and want to hand this PR over, I am free to continue here.
Thanks again, looks good so far!
I'm working on a completely different thing right now, so you can of course take over this :). |
b37318e to
0fd36fe
Compare
|
@lippserd: If you want to, feel free to start reviewing this PR now. |
0fd36fe to
2ca94db
Compare
|
Added a check against deleted objects which aren't even in the database. This happened during multiple bigger redeployments in my test setup. |
2ca94db to
7e9be66
Compare
Al2Klimov
left a comment
There was a problem hiding this comment.
Indeed. If I add and remove a comment/downtime, I get two additional events in the history tab.
Except if I stop Redis between addition and removal and restart Icinga 2 afterwards to clear eventually buffered history events. Or if I rm /var/lib/icinga2/api/packages/_api/4b444b3c-7063-4318-93f2-8b41cb6ba1ec/conf.d/*/*.conf and restart Icinga. Or if I simply use curl -fvku iw2:iw2 -X DELETE -H 'Accept: application/json' https://127.0.0.1:5665/v1/objects/comments. Then I only get the addition events, even with snapshot packages.
👎The runtime behavior doesn't even change with this PR. See curl command above.
👍In my other scenarios I indeed get removal events for both types.
This is actually a loophole in Icinga 2. When directly deleting comment objects and not using the However, the change is being propagated as a runtime update, where Icinga DB acts upon. While this is definitely something to look into, this is not really what this PR tries to address. The core are disappearing or vanishing objects, such as after a Director deployment. |
7e9be66 to
66d105e
Compare
this is not really what this PR tries to address. The core are disappearing or vanishing objects, such as after a Director deployment.
The history retention is currently based on the final or closing timestamp, if such a timestamp exists. This change switches to the initial timestamp, which always exists. As reported in the Community Forum[^0] and in an ongoing PR[^1], there might be (historical) history entries without a closing timestamp. Thus, these entries will never be rotated. Furthermore, based on the documentation, it might be expected that the initial timestamp is used for retention, not a later one. This change does not cover the SLA retention, to ensure no misleading SLA reports are being generated when a dangerously short SLA retention period was defined. [^0]: https://community.icinga.com/t/comment-history-and-retention-on-a-busy-system/15318/3 [^1]: #913
The history retention is currently based on the final or closing timestamp, if such a timestamp exists. This change switches to the initial timestamp, which always exists. As reported in the Community Forum[^0] and in an ongoing PR[^1], there might be (historical) history entries without a closing timestamp. Thus, these entries will never be rotated. Furthermore, based on the documentation, it might be expected that the initial timestamp is used for retention, not a later one. This change does not cover the SLA retention, to ensure no misleading SLA reports are being generated when a dangerously short SLA retention period was defined. [^0]: https://community.icinga.com/t/comment-history-and-retention-on-a-busy-system/15318/3 [^1]: #913
The history retention is currently based on the final or closing timestamp, if such a timestamp exists. This change also takes the initial timestamp into account as a fallback. As reported in the Community Forum[^0] and in an ongoing PR[^1], there might be (historical) history entries without a closing timestamp. Thus, these entries will never be rotated. Furthermore, based on the documentation, it might be expected that the initial timestamp is used for retention, not a later one. This change does not cover the SLA retention, to ensure no misleading SLA reports are being generated when a dangerously short SLA retention period was defined. [^0]: https://community.icinga.com/t/comment-history-and-retention-on-a-busy-system/15318/3 [^1]: #913
66d105e to
9ef9a27
Compare
It is possible for objects to vanish in Icinga 2, e.g., when a new config pack is deployed as the Director does. Icinga DB already handles this within its synchronization logic, explicitly deleting removed objects. However, these deleted objects might have open logical dependencies, such as entries in the history tables. For example, only deleting a Downtime object is not enough, as both its downtime_history and sla_history_downtime entries needs to be updated as well. In this example, these updates are required to not show the Downtimes anymore, have a valid SLA calculation, and ensure a working downtime history retention. As Icinga DB does not receive the change from Icinga 2 over the usual "icinga:history:stream:downtime" Redis stream, it must be reproduced or mocked by Icinga DB itself. Now, Icinga DB is able to close the history for both vanished Comment and Downtime objects itself. Co-Authored-By: Yonas Habteab <yonas.habteab@icinga.com>
Ensure that removed Comment and Downtime objects attached to Hosts, vanishing after a config pack redeployment, are correctly closed by Icinga DB.
9ef9a27 to
0808b72
Compare
tests: Simulate vanishing Comments and Downtimes
Ensure that removed Comment and Downtime objects attached to Hosts,
vanishing after a config pack redeployment, are correctly closed by
Icinga DB.
icingadb: Manually close deleted Comments and Downtimes
It is possible for objects to vanish in Icinga 2, e.g., when a new
config pack is deployed as the Director does. Icinga DB already handles
this within its synchronization logic, explicitly deleting removed
objects.
However, these deleted objects might have open logical dependencies,
such as entries in the history tables. For example, only deleting
a Downtime object is not enough, as both its downtime_history and
sla_history_downtime entries needs to be updated as well.
In this example, these updates are required to not show the Downtimes
anymore, have a valid SLA calculation, and ensure a working downtime
history retention.
As Icinga DB does not receive the change from Icinga 2 over the usual
"icinga:history:stream:downtime" Redis stream, it must be reproduced or
mocked by Icinga DB itself. Now, Icinga DB is able to close the history
for both vanished Comment and Downtime objects itself.
Co-Authored-By: Yonas Habteab yonas.habteab@icinga.com
tests: Bump icinga-testing for Config Pack support
Icinga/icinga-testing#34
Fixes #910.