Everyone wants autonomous incident response. Most teams are building it wrong.
The ultimate goal of autonomy in SRE and DevOps is the capacity of a system not only to detect incidents but to resolve them independently through intelligent self-regulation. However, true autonomy isn't born from automating random, isolated tasks. It requires a stable foundation: a Reference Architecture.
This blueprint serves as the "immune system" of your infrastructure, ensuring that self-healing processes don't act erratically but instead operate within clearly defined guardrails. Without these principles, autonomy is a liability, like a self-driving car without sensors to monitor the road.
The reality is simple: if your autonomy strategy is built on scripts, runbooks, and reactive automation, you don’t have autonomy; you have faster failure.
In this article, we decode how to bridge the gap between manual scripting and a truly agentic strategy. We will show you why a solid architecture is the essential prerequisite for ensuring that AI-driven approaches can function safely and effectively. We cover four areas:
Core Principles: The theoretical foundations supporting every reference architecture.
Building Blocks of Autonomy: The components where these principles must be applied to ensure safety.
Incident Response: Why failure response must be hardcoded into the very heart of the architecture.
Cloud-Native & Scaling: How modern cloud technologies redefine the implementation landscape.
Core principles of reference architecture
A reference architecture is far more than a mere recommendation or a static diagram. It is the distilled knowledge of countless failure modes and best practices. Think of it as a "constitution" for your infrastructure: it dictates how components must behave so that the overall system remains autonomously operational even under extreme stress.
Without these principles, autonomy becomes inherently unsafe, capable of acting quickly, but without the constraints needed to prevent systemic damage.
Here are the pillars upon which your autonomous strategy must rest:
1. Modularity: isolate instead of escalate
Autonomy only works if problems remain localized. By breaking down complex monoliths into independent, modular components, you ensure that an autonomous healing process in one area doesn't accidentally destabilize the entire system. Modularity is the firewall of your autonomy.
2. Observability: more than just monitoring
A system can only regulate itself if it understands its own state. This goes far beyond basic dashboards or isolated signals. True observability comes from correlating logs, metrics, and traces to build a complete, real-time picture of what’s happening across the system, enabling autonomous agents to reason about behavior, dependencies, and impact instead of reacting blindly to surface-level signals.
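To make the correlation idea concrete, here is a minimal sketch in Python: telemetry records are grouped by a shared trace ID so that an agent can inspect a single request end to end. The record shapes and field names are illustrative assumptions, not any particular vendor's schema.

```python
from collections import defaultdict

# Illustrative telemetry records sharing a trace ID (shapes are assumptions).
logs = [{"trace_id": "t-42", "level": "ERROR", "msg": "timeout calling payments"}]
metrics = [{"trace_id": "t-42", "name": "http_latency_ms", "value": 2300}]
traces = [{"trace_id": "t-42", "spans": ["checkout", "payments"]}]

def correlate(*signal_sources):
    """Group heterogeneous telemetry records by their trace_id."""
    picture = defaultdict(list)
    for source in signal_sources:
        for record in source:
            picture[record["trace_id"]].append(record)
    return picture

for trace_id, records in correlate(logs, metrics, traces).items():
    print(trace_id, records)  # one correlated view per request
```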
3. Resilience: design for failure
In an autonomous world, a failure is not an exception but a statistical certainty. A solid reference architecture anticipates outages through redundancy and failover mechanisms. The goal is graceful degradation: during partial failures, the system learns to "downshift" in a controlled way instead of failing completely.
4. Scalability: elasticity as a reflex
True autonomy means the system reacts to load spikes before the user even notices a delay. The architecture must be designed so that resources can "breathe" elastically and without manual intervention – a reflex-like expansion and contraction based on demand.
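As an illustration, the scaling reflex can be reduced to a simple proportional rule, loosely modeled on the formula used by autoscalers such as the Kubernetes HPA. The metric, target, and bounds below are illustrative assumptions.

```python
import math

def desired_replicas(current: int, load_per_replica: float,
                     target_per_replica: float = 100.0,
                     min_replicas: int = 2, max_replicas: int = 50) -> int:
    # Proportional rule: scale so that per-replica load approaches the target.
    scaled = math.ceil(current * load_per_replica / target_per_replica)
    return max(min_replicas, min(max_replicas, scaled))

print(desired_replicas(current=4, load_per_replica=250))   # 10 -> scale out
print(desired_replicas(current=10, load_per_replica=40))   # 4  -> scale in
```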
These principles form the guardrails we mentioned in the introduction. They ensure that your system’s "intelligence" has a solid data foundation and can execute its corrections safely.
Architectural patterns for safe autonomy
For a system to make independent decisions, the architecture must be built to support feedback loops and isolate faults. These patterns form the mechanical skeleton of your autonomous operations.
1. Declarative infrastructure (GitOps & IaC)
In an autonomous world, code is the "Single Source of Truth." With GitOps, you don't describe how to do something, but rather what the target state should be.
An autonomous controller constantly compares this target state with reality. If the system deviates (Configuration Drift), it corrects itself. GitOps is essentially the memory of your system, ensuring it always finds its way back to a healthy state.
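A minimal sketch of such a reconciliation loop, assuming hypothetical read_desired_state, read_live_state, and apply hooks into your Git repository and platform APIs:

```python
import time

def reconcile_forever(read_desired_state, read_live_state, apply, interval_s=30):
    """Continuously compare the declared target state with reality and correct drift."""
    while True:
        desired = read_desired_state()   # e.g. parsed manifests from the repo
        live = read_live_state()         # e.g. what the platform actually runs
        drift = {k: v for k, v in desired.items() if live.get(k) != v}
        if drift:
            apply(drift)                 # converge back to the declared state
        time.sleep(interval_s)
```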
2. Service meshes: the intelligent nervous system
Microservices alone are complex to manage. A Service Mesh adds a control plane over your services.
It enables "traffic shifting" without code changes. If a new version of a service produces errors, the system can autonomously shift traffic back to the old, stable version in milliseconds. It acts as a reflex center that reacts immediately when inter-service communication "feels pain."
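Stripped of the mesh machinery, this reflex boils down to a decision rule like the following sketch. In a real mesh, the weights would be applied through the mesh's routing configuration rather than an in-process dictionary, and the error threshold is an illustrative assumption.

```python
def adjust_weights(weights: dict, error_rate_new: float,
                   threshold: float = 0.05) -> dict:
    """Shift all traffic back to the stable version if the new one is failing."""
    if error_rate_new > threshold:
        return {"v1-stable": 100, "v2-new": 0}   # immediate shift back
    return weights

weights = {"v1-stable": 90, "v2-new": 10}
print(adjust_weights(weights, error_rate_new=0.12))  # {'v1-stable': 100, 'v2-new': 0}
```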
3. Circuit breakers & bulkheads: the emergency fuses
These patterns are borrowed from electrical engineering and shipbuilding. A Circuit Breaker cuts the connection to an overloaded service, while Bulkheads isolate resources so that a leak in one area doesn't sink the entire ship.
They prevent cascading failures. An autonomous agent can perform "healing experiments" within a bulkhead without risking a small error taking down the entire data center.
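Here is a minimal circuit-breaker sketch to illustrate the failing-fast behavior; the failure threshold and reset window are illustrative assumptions. A bulkhead could be modeled analogously as a bounded pool or semaphore per dependency.

```python
import time

class CircuitBreaker:
    """Fail fast once a dependency has failed too many times in a row."""

    def __init__(self, max_failures: int = 5, reset_after_s: float = 30.0):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None            # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()   # open the circuit
            raise
        self.failures = 0                    # success resets the counter
        return result
```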
4. Automated rollbacks & canary deployments
The risk of change is minimized through incremental introduction. A Canary Deployment rolls out updates to only 1% of users initially.
The system takes on the role of the quality auditor. It analyzes the error rate of the new version compared to the old one. If the metrics are poor, the system autonomously aborts the deployment. Here, autonomy protects the system from human error during a release.
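The quality audit can be sketched as a simple comparison of error rates between the baseline and the canary; the tolerance value is an illustrative assumption.

```python
def canary_verdict(baseline_errors: int, baseline_total: int,
                   canary_errors: int, canary_total: int,
                   tolerance: float = 0.01) -> str:
    """Abort the rollout if the canary performs measurably worse than the baseline."""
    baseline_rate = baseline_errors / max(baseline_total, 1)
    canary_rate = canary_errors / max(canary_total, 1)
    if canary_rate > baseline_rate + tolerance:
        return "rollback"   # autonomously abort the deployment
    return "promote"        # widen the rollout to more users

print(canary_verdict(baseline_errors=12, baseline_total=10_000,
                     canary_errors=9, canary_total=100))  # -> rollback
```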
Bridging the gap: From static defense to active response
These architectural patterns are the essential tools for stability, but on their own, they are reactive. A Circuit Breaker can stop a fire from spreading, and a Service Mesh can reroute traffic, but they don't necessarily "solve" the underlying crisis.
To move from a system that merely survives failure to one that resolves it, we must change how we view the incident lifecycle.
This is where the transition to true autonomy happens.
Incident management embedded in architecture
Incident response can no longer exist as a separate operational layer; it must be treated as a primary architectural citizen. Autonomy is only as reliable as the mechanisms that detect and react when things go wrong.
By embedding detection, alerting, and remediation directly into the reference architecture, organizations ensure that failure handling remains consistent across all services. This moves the needle from manual firefighting toward a system that understands and actively manages its own health.
In practice, this means integrating paging platforms and automated alerting hooks directly into deployment manifests. Modern architectures leverage automated runbooks that can be triggered by specific system events to resolve routine issues like memory leaks or disk saturation without human intervention.
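A minimal sketch of such event-triggered remediation, with hypothetical placeholder functions rather than a specific paging or platform API: routine failure modes map to runbooks, and anything automation cannot handle escalates to a human.

```python
def rotate_and_compress_logs(host: str) -> None:
    print(f"runbook: rotating logs on {host}")        # placeholder remediation

def restart_service(service: str) -> None:
    print(f"runbook: restarting {service}")           # placeholder remediation

def page_on_call(event: dict) -> None:
    print(f"paging on-call for {event['type']}")      # placeholder alerting hook

RUNBOOKS = {
    "disk_saturation": lambda event: rotate_and_compress_logs(event["host"]),
    "memory_leak": lambda event: restart_service(event["service"]),
}

def handle_event(event: dict) -> None:
    runbook = RUNBOOKS.get(event["type"])
    if runbook is None:
        page_on_call(event)          # unknown failure mode: escalate to humans
        return
    try:
        runbook(event)
    except Exception:
        page_on_call(event)          # remediation failed: escalate with context

handle_event({"type": "disk_saturation", "host": "node-7"})
```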
Furthermore, incorporating chaos engineering into the architectural lifecycle allows teams to intentionally inject failure. This validates that automated response mechanisms work as expected under real-world stress, ensuring a single incident remains isolated and does not escalate into a systemic outage.
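A chaos experiment can be reduced to a simple contract: inject a failure, then verify within a time budget that the automated response restored health. The hooks below are hypothetical stand-ins for your platform.

```python
import time

def run_chaos_experiment(inject_failure, is_healthy,
                         timeout_s: float = 60.0, poll_s: float = 5.0) -> bool:
    """Inject a failure on purpose and check that self-healing recovers in time."""
    inject_failure()                          # e.g. terminate one instance
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_healthy():
            return True                       # automated response worked as designed
        time.sleep(poll_s)
    return False                              # autonomy did not recover: escalate
```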
While embedding runbooks into individual services works for small environments, true autonomy requires a platform that can coordinate these responses across thousands of nodes. This is where the blueprint evolves from a set of patterns into a living, breathing ecosystem.
Scaling autonomy with cloud-native reference architecture
The rise of cloud-native technologies has fundamentally changed the blueprint for scalable autonomy. Kubernetes and its ecosystem take significant operational toil off teams through controllers and reconciliation loops, providing the "brain" that constantly steers the system back to its desired state. However, this also introduces new layers of complexity regarding coordination and security.
Achieving autonomy at scale requires more than just deploying containers; it requires a hardened infrastructure layer capable of managing its own state in distributed environments.
A robust cloud-native reference architecture focuses heavily on the guardrails of autonomy. This includes implementing fine-grained Role-Based Access Control (RBAC) and admission controllers to define exactly what automated agents are permitted to do within the cluster. Policy-enforcement layers ensure the system remains compliant even as it self-heals.
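To illustrate the guardrail idea, here is a minimal allowlist check an automation layer might run before executing an agent's proposed action. The agents, actions, and namespaces are illustrative assumptions; in a cluster, this enforcement would live in RBAC rules and admission controllers rather than application code.

```python
AGENT_POLICY = {
    "remediation-agent": {
        "allowed_actions": {"restart_pod", "scale_deployment"},
        "allowed_namespaces": {"payments", "checkout"},
    },
}

def is_permitted(agent: str, action: str, namespace: str) -> bool:
    """Allow an automated action only if the agent's policy explicitly permits it."""
    policy = AGENT_POLICY.get(agent)
    if policy is None:
        return False                          # unknown agents get nothing
    return (action in policy["allowed_actions"]
            and namespace in policy["allowed_namespaces"])

print(is_permitted("remediation-agent", "restart_pod", "payments"))           # True
print(is_permitted("remediation-agent", "delete_namespace", "kube-system"))   # False
```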
Finally, the reliability of these autonomous systems rests on a foundation of distributed consensus to maintain a "source of truth" that allows stateful applications to recover seamlessly across availability zones.
Conclusion: Building the foundation for agentic SRE
A Reference Architecture is more than a static diagram; it defines how your infrastructure is allowed to behave under stress. By codifying modularity, resilience, and scalability into your core design, you bridge the gap between manual scripts and a truly agentic strategy. However, the architecture is only the foundation. To fully realize a "lights-out" operational model, you must orchestrate the intelligence that sits atop it.
Don't leave your system's autonomy to chance. Ready to turn your architectural blueprint into an active defense? Download ilert’s Agentic Incident Management Guide to see how architecture and AI come together to create incident response that’s safe, scalable, and operationally sound.
The difference between an AI assistant that "almost" works and one that consistently delivers high-value results is rarely a matter of raw model capability. Instead, the bottleneck is typically the quality and structure of the instructions provided. For DevOps and SRE teams building automated workflows, "magical prompt tricks" are no substitute for a repeatable, engineered structure.
This article provides a practical plan for building effective AI agents, detailing a six-part structure you can reuse across tasks to ensure reliability, safety, and clear outputs.
The problem: Instruction quality over model capability
If you have ever felt like an AI assistant is failing to meet expectations, the issue is often a lack of structural discipline. Vague tasks inevitably produce vague outputs. To bridge this gap, engineers must treat prompts not as clever messages, but as lightweight product specifications.
By defining roles, inputs, outputs, and constraints with the same rigor used in software engineering, you can create agents that are far easier to integrate, evaluate, and debug.
The six-component prompt blueprint
At the core of every reliable agent is a blueprint consisting of six essential components. Following this structure ensures that the model has the necessary context and boundaries to perform complex tasks.
1. Role and tone: Defining the "Who" and "How"
Start by establishing the persona and communication style. This sets the lens through which the agent's decisions, vocabulary, and depth of knowledge are shaped.
Example: "Act as a senior SRE with 10 years of experience in incident response and postmortem analysis."
2. Task definition: Action-oriented goals
Specify the goal using clear, action-oriented language. State precisely what the agent needs to achieve to produce a usable deliverable.
3. Rules and guardrails: Setting boundaries
Explicitly state constraints and quality checks to ensure consistency.
Do: Use bullet points for lists.
Don’t: Include PII (Personally Identifiable Information) in the output.
4. Data: Injecting relevant knowledge
Great prompts act as both instructions and inputs. Provide any necessary session context, metadata blocks, or specific technical documentation the agent should reference.
5. Output structure: Defining "done"
Tell the agent exactly what the response should look like (e.g., Markdown, JSON, or tables).
6. Key Reminder: The North Star
Restate the most critical requirements at the end of the prompt. Repetition improves adherence, especially when dealing with longer, more complex instructions.
Formatting for legibility and debugging
To make instructions easier for the model to follow and for you to debug, leverage Markdown formatting:
Markdown headers: Use # and ## to create a clear hierarchy that is easy for both you and the model to follow.
Emphasis: Use bold text, blockquotes, or ALL CAPS for critical safety instructions.
Cross-references: Create internal links between sections to help the model connect related instructions logically.
Structured prompts make it obvious which specific instruction caused a failure when something goes wrong, significantly reducing the time spent on prompt engineering.
Prompt template
Here is the template you can copy and paste.
# Role / Tone
You are a [role] with expertise in [domain].
Tone: [clear, concise, friendly, formal, etc.].

# Task Definition
Your Goal: [one sentence describing the outcome]
Success looks like: [2–4 bullets describing what “good” means].

# Rules & Guardrails
Do: [required behaviors]
Don’t: [forbidden behaviors]
Quality checks: [accuracy, safety, policy, formatting, etc.]

# Data / Context
Audience: [who this is for]
Inputs: [paste text, metrics, constraints, examples]
Definitions: [key terms]

# Output Structure
Return your answer as:
Format: [Markdown / Table / JSON]
Sections: [list exact headings]

# Key Reminder
Repeat the two most important constraints here.
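If you generate prompts programmatically, the same blueprint can be assembled from structured inputs. Below is a minimal Python sketch; the section names follow the template above, and the example values are illustrative.

```python
# The six sections of the blueprint, in order.
SECTIONS = ["Role / Tone", "Task Definition", "Rules & Guardrails",
            "Data / Context", "Output Structure", "Key Reminder"]

def build_prompt(parts: dict) -> str:
    """Assemble a six-component prompt from per-section text."""
    return "\n\n".join(f"# {name}\n{parts[name].strip()}" for name in SECTIONS)

prompt = build_prompt({
    "Role / Tone": "You are a senior SRE. Tone: concise and formal.",
    "Task Definition": "Draft a postmortem summary from the incident timeline.",
    "Rules & Guardrails": "Do: use bullet points. Don't: include PII.",
    "Data / Context": "Audience: engineering leadership. Inputs: <incident timeline>.",
    "Output Structure": "Format: Markdown with sections Summary, Impact, Actions.",
    "Key Reminder": "No PII. Keep the summary under 200 words.",
})
print(prompt)
```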
Conclusions
Building effective AI agents requires moving away from conversational prompts and toward engineering-grade specifications. By using the six-component blueprint – Role/Tone, Task, Rules/Guardrails, Data, Output Structure, and Key Reminder – you ensure that your AI assistants are predictable, reliable, and production-ready.
When I first started using AI (Cursor, to be more specific) for coding, I was very impressed to see how it could generate such high-quality code, and I understand why it's now one of the most widely used tools for software engineers. As I continued to use these tools more regularly, I realized they are far from perfect. Their effectiveness depends heavily on how they are used and the context in which they are applied. In this blog post, I'd like to share more about my daily use of AI coding tools and where I find them truly useful.
Using Cursor for code navigation
Code navigation is the feature I find most helpful. Every mature organisation has some form of monolithic codebase, and navigating through it isn't easy, especially when you are new to the team. If you know what you are looking for, AI can provide highly accurate explanations and guide you to the right files, functions, patterns, and so on. When I joined ilert in June 2025, I found Cursor's code navigation and flow explanations very useful; they made building context about the monolith much smoother. Without them, I would have had to put in much more effort and been more dependent on teammates to clarify my doubts and questions.
Boilerplate code and unit tests
In terms of code generation, AI is very effective at producing boilerplate code and writing unit tests. Cursor builds context for the entire project and understands existing coding patterns and styles. So when you want something trivial, like creating new DB tables and entities, generating test data, setting up tests, or developing mocks, it can easily do that by modelling the existing code. Similarly, it can generate a good number of unit tests.
For more complex tests, Cursor can also be helpful, but in my experience so far it may not produce accurate results. Since boilerplate generation is taken care of by AI, coding and writing tests have become significantly faster. An important caveat is that you do need to review the code it creates, especially in business-critical areas, and verify its correctness. I am also extra careful with code generation when the application is security-sensitive or critical.
Accelerating the learning of newer tech stacks
Another place I find AI handy is when dealing with newer tech. AI reduces the time needed to master new technologies. Here are a few examples.
ServiceNow app
I was working on building a marketplace app for ServiceNow, which I had never worked with before. Getting acquainted with ServiceNow can be time-consuming. When I started, the only thing I knew was the task itself; I had no technical details about ServiceNow, its apps, or the marketplace. With AI, you simply specify the type of app you need and mention that you are new to ServiceNow app development. After that, the AI provides steps to get started with ServiceNow. It outlines different ways to develop an app, details the type of code you may need to write, and also explains how to create an app using only configurations. Without AI tools, I would eventually have learnt all these concepts after exhaustive Google searches and reading multiple sources, but with AI it was faster, easier, and more efficient. ChatGPT and ServiceNow’s internal coding assistance (similar to Cursor) helped me understand the platform better in far less time, and I was able to create the POC before the deadline.
Learning Rust
Similarly, I had to pick up the programming language Rust for my work. I found that ChatGPT and Cursor lowered the barrier to entry. For anyone not familiar with it, Rust is a fairly complicated language for beginners, especially if you are learning it as a Java programmer. Rust’s unique memory management and the concept of borrowing can be intimidating.
Generally, to learn any programming language, you need to understand syntax, keywords, flows, data types, etc. It was easy to map the basics of syntax and data types from Java. Once you have grasped the basics, you want to get your hands dirty with coding exercises, identify errors, understand why they occurred, and fix them.
This is where ChatGPT and Cursor were super helpful:
Error decoding: Instead of looking for answers on Stack Overflow, I could easily receive detailed explanations of why the error occurred.
Proactive learning: On top of answering my questions, AI listed common roadblocks other developers face. It understood that I was new to Rust, and I found it very useful to learn about the common pitfalls even before I encountered them.
Efficient search: The internet is a sea of information. You can eventually find your answer after an exhaustive search across multiple websites, but AI gives you a focused answer for your specific error.
AI not only helps you code, but it also helps you evolve. It lowers the barrier to entry for complex technologies, allowing developers to remain polyglots in a fast-changing industry.
Learnings
1. Provide enough context for higher accuracy results
Providing context for your needs is critical. Unlike humans, AI doesn’t ask follow-up questions. When the request is vague, AI relies on default public data and produces results that are far from accurate. By contrast, if you provide better context, such as edge cases, preferred libraries, and more descriptive business requirements, AI produces better results. Therefore, it's more about how you ask: how precisely you frame your questions and how much information you provide about your problem.
Example 1. File Processing Standards
In my previous workplace, we were implementing a file-processing workflow. The requirement was to read a file, process it, and move it to an archive in S3. The AI generated code that read files using Java's newer NIO Path API, whereas our standard was to use FileReader. This is a subtle but important example of how AI can produce results that aren’t consistent with organizational standards.
Example 2. Unit testing: Missing business context
Similarly, for unit testing, if you provide an instruction like "write a unit test for this method," AI will generate basic tests that cover the main decision branches and happy paths. Without explicitly stated expectations such as business rules, edge cases, and failure scenarios, it often misses critical, business-specific cases; AI cannot determine which cases truly matter. As a result, the tests may look complete but provide limited confidence in real-world projects.
Providing context is essential to getting accurate results. Even if you don't do it initially, you will end up providing it eventually, as you won't be satisfied with the results. Therefore, investing time in sharing precise, well-defined information isn’t extra work; it is simply a better practice. Clear context enables AI to generate code that is more usable and production-ready.
2. AI can hallucinate; verification is important
By hallucinations, we usually mean cases where AI generates code or explanations that appear valid but are incorrect. I encountered this multiple times while building a ServiceNow application. It made me realize that you can't blindly depend on the responses AI provides, and that verification and testing are essential.
Example 1: Sealed objects and ServiceNow constraints
In one scenario, the application needed to make an external REST call. ServiceNow provides the sn_ws object for this purpose. The AI-generated code used the object correctly in theory and aligned with common REST invocation patterns.
However, the implementation failed at runtime with the error: “Cannot change the property of a sealed object.” Despite several iterations, the AI was unable to diagnose the root cause. Further investigation revealed that certain ServiceNow objects are sealed and restricted to specific execution contexts. These objects cannot be instantiated or modified; they must be used within platform components. This is a platform-specific constraint that isn’t obvious from generic examples, and AI was unable to handle it.
Example 2: Cyclical suggestions
In another case, the AI-provided solution didn’t work. Subsequent prompts produced alternative results, none of which resolved the issue. After several iterations, the AI began repeating previously suggested approaches, as if entering a loop. At that point, I had to fall back on the official API documentation and a deeper examination of the platform components to resolve it.
AI can generate invalid results, use libraries with known vulnerabilities, and so on. Therefore, it’s crucial to validate the result, especially when you are dealing with secure or business-critical code.
3. AI can be very descriptive; ask it to be concise
AI systems tend to produce highly descriptive responses by default. While this can be useful for learning or exploration, it isn’t always ideal for day-to-day software engineering work. In real-world environments, we are often working under tight deadlines where speed is more important than detailed explanations. When using AI as a coding assistant, concise output is usually more effective. Long explanations, excessive comments, or multiple alternative approaches can slow you down. Explicitly asking for a concise response makes AI produce results that are quicker to evaluate and easier to use.
This becomes especially important during routine tasks such as writing small utility methods, refactoring existing code, generating unit tests, and exploring existing projects. In these cases, we typically want actionable code, not a tutorial. A prompt such as “Provide a concise solution with minimal explanation” can significantly improve results and save time.
Being descriptive isn’t bad, but it isn’t always effective. By asking for concise output, you guide the AI to produce exactly what you want more efficiently.
Conclusion
AI has significantly changed the way I work as a software engineer. It has helped me with code navigation, learning newer technologies, writing documentation, and being more productive. It's not perfect, but I am confident that it will improve significantly. I see it as a handy assistant, another toolset in your repertoire.