feat: add resource utilisation middleware to monitor CPU and memory. #1352
Merged
Changes from 7 commits
21 commits
c3bdb12 feat: add resource utilization middleware to monitor CPU and memory u… (vkhinvasara)
4fb2da7 refactor: changed thresholds to 90% (vkhinvasara)
5ce1ab9 added resource monitoring with configurable CPU and memory thresholds (vkhinvasara)
b58212f fix: use blocking task for resource usage retrieval and update memory… (vkhinvasara)
f6a2517 fix: update memory unit logging from GiB to GB for consistency (vkhinvasara)
891e881 refactor: replace RwLock with LazyLock and AtomicBool for speeeedddd (vkhinvasara)
f90714d fix: add resource utilization middleware to ONLY the ingest routes (vkhinvasara)
cd8c029 Added resource check interval option and updated the resource monitor… (vkhinvasara)
4578d0c refactor: changed the default interval to 15 seconds. (vkhinvasara)
53a4532 refactor: remove resource monitor initialization from ingest, query, … (vkhinvasara)
0279f36 Merge branch 'main' into main (nikhilsinhaparseable)
88b4596 refactor: add resource utilization middleware to logstream routes, de… (vkhinvasara)
5b142b7 refactor: removed resource_check from the PUT stream. (vkhinvasara)
5f0fa34 Merge branch 'main' into main (nikhilsinhaparseable)
da3583a refactor: simplify uptime retrieval in Report::new method (vkhinvasara)
305ad25 refactor: glazing clippy (vkhinvasara)
1469507 empty push (vkhinvasara)
b083724 nit: checking if the build runs (vkhinvasara)
63e7a58 version change to not use cache for this build (vkhinvasara)
3262815 Merge branch 'main' into main (nikhilsinhaparseable)
4f1ae5c fix: update cache key format in build workflow (vkhinvasara)
@@ -0,0 +1,128 @@

```rust
/*
 * Parseable Server (C) 2022 - 2024 Parseable, Inc.
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Affero General Public License as
 * published by the Free Software Foundation, either version 3 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Affero General Public License for more details.
 *
 * You should have received a copy of the GNU Affero General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 */

use std::sync::{atomic::AtomicBool, Arc, LazyLock};

use actix_web::{
    body::MessageBody,
    dev::{ServiceRequest, ServiceResponse},
    error::Error,
    error::ErrorServiceUnavailable,
    middleware::Next,
};
use tokio::{
    select,
    time::{interval, Duration},
};
use tracing::{info, trace, warn};

use crate::analytics::{refresh_sys_info, SYS_INFO};
use crate::parseable::PARSEABLE;

// Holds the result of the latest check: `true` means utilization is within limits.
// It starts as `false`, so the middleware rejects requests until the first check stores a result.
static RESOURCE_CHECK_ENABLED: LazyLock<Arc<AtomicBool>> =
    LazyLock::new(|| Arc::new(AtomicBool::new(false)));

/// Spawn a background task to monitor system resources
pub fn spawn_resource_monitor(shutdown_rx: tokio::sync::oneshot::Receiver<()>) {
    tokio::spawn(async move {
        let mut check_interval = interval(Duration::from_secs(30));
        let mut shutdown_rx = shutdown_rx;

        let cpu_threshold = PARSEABLE.options.cpu_utilization_threshold;
        let memory_threshold = PARSEABLE.options.memory_utilization_threshold;

        info!(
            "Resource monitor started with thresholds - CPU: {:.1}%, Memory: {:.1}%",
            cpu_threshold, memory_threshold
        );
        loop {
            select! {
                _ = check_interval.tick() => {
                    trace!("Checking system resource utilization...");

                    refresh_sys_info();
                    // Read the shared system info on a blocking thread so the
                    // mutex is never held on an async worker.
                    let (used_memory, total_memory, cpu_usage) = tokio::task::spawn_blocking(|| {
                        let sys = SYS_INFO.lock().unwrap();
                        let used_memory = sys.used_memory() as f32;
                        let total_memory = sys.total_memory() as f32;
                        let cpu_usage = sys.global_cpu_usage();
                        (used_memory, total_memory, cpu_usage)
                    }).await.unwrap();

                    let mut resource_ok = true;

                    // Calculate memory usage percentage
                    let memory_usage = if total_memory > 0.0 {
                        (used_memory / total_memory) * 100.0
                    } else {
                        0.0
                    };

                    // Log current resource usage on every check
                    info!(
                        "Current resource usage - CPU: {:.1}%, Memory: {:.1}% ({:.1}GB/{:.1}GB)",
                        cpu_usage,
                        memory_usage,
                        used_memory / 1024.0 / 1024.0 / 1024.0,
                        total_memory / 1024.0 / 1024.0 / 1024.0
                    );

                    // Check memory utilization
                    if memory_usage > memory_threshold {
                        warn!(
                            "High memory usage detected: {:.1}% (threshold: {:.1}%)",
                            memory_usage, memory_threshold
                        );
                        resource_ok = false;
                    }

                    // Check CPU utilization
                    if cpu_usage > cpu_threshold {
                        warn!(
                            "High CPU usage detected: {:.1}% (threshold: {:.1}%)",
                            cpu_usage, cpu_threshold
                        );
                        resource_ok = false;
                    }

                    let previous_state = RESOURCE_CHECK_ENABLED.load(std::sync::atomic::Ordering::SeqCst);
                    RESOURCE_CHECK_ENABLED.store(resource_ok, std::sync::atomic::Ordering::SeqCst);

                    // Log state changes
                    if previous_state != resource_ok {
                        if resource_ok {
                            info!("Resource utilization back to normal - requests will be accepted");
                        } else {
                            warn!("Resource utilization too high - requests will be rejected");
                        }
                    }
                },
                _ = &mut shutdown_rx => {
                    trace!("Resource monitor shutting down");
                    break;
                }
            }
        }
    });
}

/// Middleware to check system resource utilization before processing requests
/// Returns 503 Service Unavailable if resources are over-utilized
pub async fn check_resource_utilization_middleware(
    req: ServiceRequest,
    next: Next<impl MessageBody>,
) -> Result<ServiceResponse<impl MessageBody>, Error> {
    let resource_ok = RESOURCE_CHECK_ENABLED.load(std::sync::atomic::Ordering::SeqCst);

    if !resource_ok {
        let error_msg = "Server resources over-utilized";
        warn!("Rejecting request to {} due to resource constraints", req.path());
        return Err(ErrorServiceUnavailable(error_msg));
    }

    // Continue processing the request if resource utilization is within limits
    next.call(req).await
}
```
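
The per-tick decision in the monitor can be looked at separately from Tokio and actix-web. The following is a minimal, std-only sketch of that decision, not the PR's code: `Sample`, `memory_percent`, `within_limits` and `publish` are hypothetical names introduced here for illustration, and the threshold values in `main` are made up. It also swaps in `AtomicBool::swap` where the diff uses a separate `load` and `store`, so that reading the previous state and writing the new one happen in one atomic step.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical snapshot of one resource sample (not a type from the PR).
#[derive(Debug, Clone, Copy)]
struct Sample {
    used_memory: f32,  // bytes
    total_memory: f32, // bytes
    cpu_usage: f32,    // percent
}

/// Memory usage as a percentage, guarding against a zero total.
fn memory_percent(s: &Sample) -> f32 {
    if s.total_memory > 0.0 {
        (s.used_memory / s.total_memory) * 100.0
    } else {
        0.0
    }
}

/// True when neither CPU nor memory exceeds its threshold
/// (same strict `>` comparison as the monitor in the diff).
fn within_limits(s: &Sample, cpu_threshold: f32, memory_threshold: f32) -> bool {
    let memory_high = memory_percent(s) > memory_threshold;
    let cpu_high = s.cpu_usage > cpu_threshold;
    !memory_high && !cpu_high
}

/// Stores the new state and returns whether it differs from the previous one.
/// Assumption: `swap` replaces the load-then-store pair used in the diff.
fn publish(flag: &AtomicBool, resource_ok: bool) -> bool {
    let previous = flag.swap(resource_ok, Ordering::SeqCst);
    previous != resource_ok
}

fn main() {
    // Starts as `false`, mirroring the initial value in the diff.
    let flag = AtomicBool::new(false);

    // Illustrative samples only; the numbers are invented.
    let calm = Sample { used_memory: 4.0e9, total_memory: 16.0e9, cpu_usage: 35.0 };
    let busy = Sample { used_memory: 15.5e9, total_memory: 16.0e9, cpu_usage: 35.0 };

    for s in [calm, busy, busy] {
        let ok = within_limits(&s, 90.0, 90.0);
        let changed = publish(&flag, ok);
        println!(
            "memory {:.1}% cpu {:.1}% ok={} state_changed={}",
            memory_percent(&s),
            s.cpu_usage,
            ok,
            changed
        );
    }
}
```

Keeping the comparison in a pure function like this makes the threshold behaviour easy to unit test without a running server; whether that split suits the actual module is a design choice for the maintainers.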