The New Stack | DevOps, Open Source, and Cloud Native News

Traceability in AI-Enhanced Code: A Developer’s Guide

Josep Prat — Sun, 13 Oct 2024 17:00:18 +0000

The swift and widespread adoption of Generative AI has permeated business sectors across the globe. With transcription tools and content creation readily available, AI’s potential to reshape the future is endless. From software tools that AI will render obsolete to new ways of coding, it poses profound challenges for software development and the industry.

Today, the industry faces the challenge of solving a riddle: If a developer has taken a piece of code and modified it with AI, is it still the same code? One of the significant challenges facing software developers is how to do this without hampering creativity or overstepping the line regarding copyright or licensing laws.

To date, officials are stymied. The regulatory environment for AI is still evolving as policymakers and regulators work to address the potential ethical, security, and legal challenges posed by AI technologies. The U.S. Copyright Office has explored how AI-generated works intersect with copyright law. Still, there needs to be an established, comprehensive code or legal framework specifically governing how AI developers use copyrighted materials. In the UK, the Intellectual Property Office (IPO) confirmed recently that it has been unable to facilitate an agreement for a voluntary code of practice that would govern the use of copyright works by AI developers.

Balancing intellectual property rights with technological advancement as AI evolves remains a significant issue.

AI and OS — A Perfect Match

Open-source software provides fertile ground for training AI models because it lacks restrictions associated with proprietary software. It gives AI access to many standard code bases that run infrastructures worldwide. At the same time, it is exposed to the acceleration and improvements AI generates, further enhancing Open Source development capabilities.

Developers, too, massively benefit from AI because they can ask questions, get answers, and, right or wrong, use AI as a basis to create something to work with. This significant productivity gain is rapidly accelerating and refining coding. Developers can leverage AI to solve mundane tasks quickly, get inspiration, or source alternative examples of something they thought was a perfect solution.

Total Certainty and Transparency

However, it’s not all upside. The integration of AI into OSS has complicated licensing implications. General Public License (GPL) is a series of widely used free software licenses (there are others, too), or copyleft, that guarantee end users four freedoms: to run, study, share, and modify the software. Under these licenses, any software modification needs to be released within the same software license. If a code is licensed under GPL, any modification must also be GPL-licensed.

Therein lies the issue. Unless there is total transparency in how the software has been trained, it is impossible to be sure of the appropriate licensing requirements or how to license it in the first place. Traceability is paramount if copyright infringement and other legal complications are to be avoided. Additionally, there is the ethical question — if a developer has modified a piece of code, is it still the same code? We’ve covered that in more detail here.

Traceability

So the pressing issue is this: What practical steps can developers take to safeguard themselves against the code they produce, and what role can the rest of the software community — OSS platforms, regulators, enterprises, and AI companies — play in helping them do that? OSS offers transparency to support integrity and confidence in traceability because everything is exposed and can be observed. A mistake or oversight in proprietary software might happen, but because it is a closed system, the chances of seeing, understanding, and repairing the error are practically zero. Developers working in OSS operate in full view of a community of millions. The community requires certainty about where a source code from a third party originated — is it a human, or is it AI?

Foundations

Apache Software Foundation has a directive saying maintainers of their projects shouldn’t take source code done by AI. AI can assist them, but the code they contribute is the developer’s responsibility. If it turns out that there is a problem, then it’s the developers’ issue to resolve. Many companies, including Aiven, have a similar protocol. Our guidelines state that developers can use only the pre-approved constrained Generative AI tools. Still, developers are responsible for the outputs and need to be scrutinized and analyzed, not simply taken as they are. This way, we can ensure that we comply with the highest standards. What guidelines and standards has your company established, and how can you help establish them? These are good questions to ask if none exist.

Beyond this, there are ways organizations using OSS can also play a role by taking steps to safeguard their risks in the process. This includes establishing an internal AI Tactical Discovery team — explicitly created to focus on the challenges and opportunities created by AI. In one case, our team led a project to critique OSS code bases, using tools like Software Composition Analysis to analyze the AI-generated codebase, comparing it against known open-source repositories and vulnerability databases.

Creating a Root of Trust in AI

Despite efforts today, creating new licensing and laws around AI’s role in software development will take time. Consensus, which will be attained with investigation, review, and discussion, is required regarding the specifics of AI’s role and the terminology used to describe it. This challenge is magnified by the speed of AI development and its application in code bases. This process moves much quicker than those trying to put parameters in place to control it.

When assessing whether AI has provided copied OSS code as part of its output, factors such as proper attribution, license compatibility, and ensuring the availability of the corresponding open source code and modifications are necessary. It would also help if AI companies started adding traceability to their source code. This will create a root of trust that has the potential to unlock significant benefits in software development.

The post Traceability in AI-Enhanced Code: A Developer’s Guide appeared first on The New Stack.

What Platform Engineering Meant for Adidas’s SREs

Jennifer Riggins — Sun, 13 Oct 2024 13:00:49 +0000

Every second counts in e-commerce. So when the global sportswear company Adidas went from less than 3,000 to 29,000 requests per second last Black Friday, that translates to a tenfold increase in orders. If the website can handle it.

A few months prior, the site reliability engineering (SRE) team worked with tech leads, system architects and business to orchestrate a platform engineering transformation that modernized the Adidas architecture from monolithic to microservices. This should’ve enabled scalability.

But when the website’s ability to process coupons went down, this new complexity brought its own challenges — all while under the constant threat of sneaker bots (bots that buy sneakers for resale). At DevOpsDays London in September, the SRE team shared Adidas’s journey of platform engineering, observability, security and microservices.

The Cost of a Monolithic Pace

When Andreia Otto, senior director of software engineering for SRE and operations, joined Adidas seven years ago, the company’s software developers had a six-week release cycle.

“The development team was working during six weeks and then after that, we had a one-day call with so many people to go through all the changes, to go through everything that would be live,” she told the DevOpsDays audience. “It was throwing the ball to the other side. Development and operations were really separated.”

Also, infrastructure couldn’t scale in response to demand. The e-commerce team needed to increase order throughput for popular sneaker drops. A couple of years ago, Adidas ran a campaign that had to be extended by three days because so much throttling was needed in order to sell the in-demand stock.

It became clear that a move to microservices was essential to save time and money. Which it did.

“This year, we had a similar campaign that we could run in a couple of hours,” Otto said. “We mobilized way less people in way less time. It’s much cheaper for the company to sell, reducing operational costs and vendor dependency.”

The Adidas SRE team also aimed to reduce operational costs and vendor dependency by using a platform engineering approach to facilitate the move from monolithic to microservices architecture.

The team went from being able to plan for three days for their platform to accept 4,000 orders per minute to rolling out a release in a matter of hours that could accept 40,000 orders per minute.

“We had a big monolith serving everything from frontend to backend,” Otto said. “Now we have full control — obviously much more complexity, but still we have full control.”

The Adidas Microservices Architecture

Adidas engineering kicked off with implementing a top-down MACH architecture across consumer experience, website, B2B, retail stores and apps:

Microservices. The monolith was broken down into business capabilities, including the shopping basket, promotions, logging, B2B and checkout, each into at least one microservice with its own end-to-end CI/CD pipeline and independent lifecycle.
API-first. “We want to build the capability once and want to be able to use [it] everywhere,” Otto said, which includes being channel-agnostic and enabling reusable APIs.
Cloud native. This includes a dedicated team for Kubernetes clusters and the adoption of some Amazon Web Services.
Headless. “If you work in e-commerce, it’s very clear that the frontend needs to have much more changes during the day than the backend,” Otto said. “If a team needs to deploy more frequently, they [must be] independent of a team that needs to deploy less frequently.”

Andreia Otto and Ravikanth Mogulla of Adidas explained their company’s CI/CD pipeline to the DevOpsDays London audience (Source: Jennifer Riggins).

At Adidas, the platform engineering team provides the internal developer platform, Kubernetes, observability tooling and the Kafka-based data platform. The SRE team then sits between the platform and app development teams.

One goal of this move to a new platform, Otto said, was to enable developers by automating everything within Jenkins from the push to the deployment to Kubernetes. Then everything is integrated with Microsoft Teams to increase transparency among the whole engineering organization.

“If a branch or if any code that is deployed is allowed to go to production, we have an automation that will pop up a message in Teams and you can click the button” to approve the release, Otto said. “There was a conscious decision to not automate everything to production. Some teams are more mature, some teams are less mature.”

A more mature team may already have adopted progressive delivery techniques like canary deployments, but even they have that final approval before going to production.

New Platform, New Problems

The presenters showed a Willy Wonka meme that had the DevOpsDays audience LOL-ing:

Often in platform engineering, the why is a lot easier than the how. When Adidas shifted to a microservices platform in July 2023, it may have increased the speed to market, but it also increased its complexity dramatically in four main areas:

Availability of service.
Breaking changes in deployments.
Repeated code.
Security.

“Before, we had one monolithic, big service that we need to troubleshoot,” as an SRE team, Otto said. “Now, if something goes wrong, we have all of this to figure it out, to make sure that we have proper logs, proper tracing and the whole CI/CD is also independent. It means that the complexity is way higher.”

Andreia Otto and Ravikanth Mogulla of Adidas showed the DevOpsDays London audience how complex their systems became after moving from a monolith to microservices. (Source: Jennifer Riggins)

Immediately following the migration, the SRE team may have found they had more control, but the system’s complexity had also increased dramatically.

“If there is one thing that our team knows — and I think everybody here in this room also knows — is that you can prepare as much as you can, but things will go wrong, and we need to be ready for that,” Otto told the DevOpsDays audience.

Her co-presenter, Ravikanth Mogulla, checkout and payments site reliability engineer lead at Adidas, said that the teams faced some classic challenges, but “we started to improve with each and every incident.”

Here’s more about how the SRE team dealt with the challenges it faced once Adidas made the move to microservices and platform engineering:

Challenge No. 1: Availability of Service

The first challenge became observability, with the first major incident happening on Black Friday — when they typically experience 10x traffic. One of the databases responsible for the coupon functionality started having some performance issues, with a lot of calls getting piled up in the upstream services.

“We did not have proper timeouts across different microservices, and we also found that some of the microservices were not matured in terms of tracing,” Mogulla said. In addition, the team discovered that “the service which was also responsible for the issue did not integrate or it did not have the tracing with our [application performance management] tool.”

On one of the biggest shopping days of the year, Adidas had a mean time to recovery (MTTR) of close to 90 minutes. For context, high-performing organizations have an MTTR of under an hour. The SRE team was understandably stressed.

“We knew that something was wrong with the coupon service, [but] the biggest challenge for us was to pinpoint the exact issue because almost every call across all the microservices was failing. It was not straightforward, and we had to loop in a lot of teams,” Mogulla said. “By that time, it was almost 90 minutes, [and] we lost a lot of orders.”

Challenge No. 2: Breaking Changes in Deployments

The new Adidas microservices architecture is set up with Jenkins orchestration to cascade the calls from the different upstream channels of .com, web, desktop and mobile app to different downstream services.

One day, when the team worked together to update to a new version of the Jenkins SDK, while the other three services simultaneously updated smoothly, for some reason the forced sync of deployment resulted in a half-hour of downtime to the .com service.

“We realized that we were already with the different version and that resulted in close to 35 minutes of mean time to resolution,” Mogulla said. “We realized how much dependency the whole architecture has now.”

Challenge No. 3: Repeat Code

At Adidas, the SRE team is responsible to get the infrastructure from the platform team and to automate the full CI/CD pipeline. Once Adidas moved to a microservices platform, each app team became responsible for its own monitoring, security scans, secrets management, Helm charts and Terraform alerting along its own CI/CD pipeline. This resulted in a lot of teams doing the same thing in different ways, across 15 different pipelines.

The SRE team is working to ground Adidas engineering in the DRY Principle — don’t repeat yourself — as an aim to reduce redundant information, especially in areas that are constantly being changed.

The SRE wanted to use the internal developer platform to enable code reuse across the different app teams.

Challenge No.4: Security

The Adidas platform engineering team takes care of the infrastructure and much of the security, including ingresses and secrets management. It’s an uphill battle against sneaker bots — in fact, three out of the five highest-impact bots are targeting Adidas.

“Whenever we have any of the bigger launches or any of the hype articles, we are more susceptible to the bots,” Mogulla said. Therefore, “we do a lot of things from the [content delivery network], [which] gives us a lot of inbuilt features and we make sure that our endpoints are protected.”

However, Adidas, as a publicly traded company, is beholden to a lot of stakeholders. The business team is worried about false positives, he said, while the SREs are more interested in blocking the false negatives. Finding that balance became tricker as microservices exponentially increased the exposed endpoints to more than 200.

“You cannot block 100% of the bots. There are some bots that bypass all these [CDN-based security] features,” Mogulla said. “Either we find the unique criteria — be it with user agents or maybe TLS fingerprints or hash keys — a lot of criteria, like maybe the referrers [that] we use to block [or] we identify immediate LAN, the ASNs or the IP subnets, so it was straightforward.”

But the sneaker bots continue to become more sophisticated, spanning geographies.

Add to this, “Cloud native is for everybody, including the bots, so they just spin different containers or the clusters in the cloud — and then the attack is massive,” he continued. In response, the SRE team has set up authentication between services and reorganized incident management.

The New Adidas Observability Platform

Adidas looked to increase its resiliency by putting new failover, observability and security measures in place.

The SRE team started reviewing the timeouts across all the microservices to fix latency issues, determining the right timeout to set for each call, for each endpoint. The team also implemented circuit breaking, to shut off the failure in one system from affecting another, as well as adding an inbuilt feature flag tool.

“As SREs at Adidas, we have resilience and stability as the mindset,” Mogulla said. “And whenever we release something into production, the bigger changes, we always make sure this is behind a feature flag,” he said, with observability being mandatory.

“In fact, the entire transition from the monolith to the microservice was behind a feature flag. With just one click, we were able to switch from the legacy to the new microservice architecture.”

The team has also implemented canary deployments and SDK versioning. And, whenever a new version is ready, the upstream services deploy independently.

The Adidas SRE team switched to OpenTelemetry for “more vendor-neutral” end-to-end tracing.

“Prior to this [transformation], we had logs for different applications, but all of them were segregated between different tenants,” Mogulla said. “With microservices, we did not want to switch between different tenants. We use OpenSearch, and then, in a single tenant, but the logs are separated between the different indexes,” all within a one-stop observability dashboard.

With an eye on increased security — especially against the rampant sneaker bots — the SREs now protect all ingresses with SSL/TLS certificates with secured authentication, with Mozilla SOPs (standard operating procedure encryption) on all certificates. They also maintain seed code with IP addresses on allowlists.

“Every failure is an experience for us,” Mogulla said. It’s not easy, he added, when an organization has so many teams and microservices, “to have that proper coordination and to improve on each incident.”

Incident Management Is About Stability

Downtime is unavoidable. It’s how you respond that’s most important.

Adidas’s incident management is organized into two teams:

Site reliability engineering focuses on development and operations.
Service management focuses on process creation and auditing.

But you don’t start with incident management, Otto said. Instead, you start with defining stability for your organization. Only then, she said, can you measure it.

Stability at Adidas is measured in three metrics:

Mean time to detect (MTTD).
Mean time to recover (MTTR).
Revenue impact.

The latter creates a common language with the business stakeholders of the e-commerce platform.

“If we say we have a common KPI, then we don’t go into that debate: Should I prioritize value? Should I prioritize stability? No. We have one metric that we all understand,” Otto said. Which for Adidas and, frankly, most enterprises, “It’s all about money.”

Something qualifies as a major incident based on those three priority metrics. Then an incident manager is appointed to coordinate all communication among the relevant teams, including SREs, developers and marketing.

The following day includes an incident brief or root cause analysis (RCA). Adidas follows a technical template, which asks questions like:

What did we miss technically?
Is observability missing?
Was a test missing?
Process-wise, what happened?

This is an intentionally blameless postmortem, Otto emphasized, which is especially encouraged by leadership, because “the whole point of having the incident debrief [is] really going into what actually happened and create action items.”

An incident at Adidas closes with problem management, also run by the service management team, in order to prioritize the next steps — looking to ever-tighten the incident management feedback loop.

While Adidas made the switch to a microservices architecture over a year ago, the SRE pair ended by saying that observability and platform engineering are a continuous journey, not a destination.

“That was our journey, moving from monolithic to microservices. There is another one coming up with GenAI,” Otto concluded. “There is something else that will eventually pop up — it’s continuous learning.”

The post What Platform Engineering Meant for Adidas’s SREs appeared first on The New Stack.

How to Ensure Cloud Native Architectures Are Resilient and Secure

Akhil Mittal — Sat, 12 Oct 2024 17:00:08 +0000

Organizations are racing to innovate and scale with cloud native technologies in today’s fast-paced digital landscape. But in my experience, this rush often comes at a cost — especially regarding security. In a recent project with a financial services company, I saw firsthand how prioritizing speed over security exposed critical vulnerabilities.

At first glance, the company I worked with seemed like a cloud native success story: microservices spread across multiple regions, fully automated pipelines, and frequent feature releases. However, during a security audit, we discovered a severe vulnerability in how their APIs communicated, which put the entire system at risk. The team implemented broad API access controls to simplify scaling, which unintentionally created a significant security gap. With just one service compromised, an attacker could move laterally through the system, potentially accessing sensitive financial data.

In my experience, API vulnerabilities are becoming a common entry point for attackers, leading to many of the data breaches we see today. A 2023 Salt Labs report showed that 94% of organizations faced API security issues last year, mainly due to misconfigurations and poor visibility. These statistics underscore the significant risks that insecure APIs pose to businesses. Gartner predicts that by 2025, nearly half of enterprise APIs could go unmanaged, creating significant security gaps. As companies focus on speed and growth, security often gets left behind.

Microservices: Added Flexibility, Added Risk

Microservices offer flexibility and faster updates but also introduce complexity — and more risk. In this case, the company had split its platform into dozens of microservices, handling everything from user authentication to transaction processing. While this made scaling more accessible, it also increased the potential for security vulnerabilities. With so many moving parts, monitoring API traffic became a significant challenge, and critical vulnerabilities went unnoticed.

Without proper oversight, these blind spots could quickly become significant entry points for attackers.

Unmanaged APIs could create serious vulnerabilities in the future. If these gaps aren’t addressed, companies could face major threats within a few years.

Why Automation Alone Won’t Secure Your APIs

Automation helped the company release features quickly by scanning code and dependencies for security issues. Although automation worked initially, it overlooked more significant problems, such as overly broad API settings. Overreliance on automation led the team to overlook more profound design flaws. Although the automated tools caught more minor code issues, they failed to detect system-wide vulnerabilities.

I’ve noticed this issue increasing in cloud native setups. Teams often lean too much on automation without realizing these tools can miss subtle but critical issues, like overly broad API permissions or configuration shifts. While automation is crucial for speed, it’s not enough. Manual reviews and regular audits are essential to catching architectural flaws that automation might miss.

How We Fixed the Problem: Building for Resilience

Once we identified the vulnerabilities, it was clear the architecture needed more than a quick fix — it required a complete overhaul. Here’s how we addressed the issues:

Enforced Least Privilege for APIs: We reviewed all API interactions and reconfigured access controls to follow the least privilege principle. Each microservice was granted only the needed access, significantly reducing the attack surface.
Hardened Access Control Policies: Wide access controls were tightened, ensuring each service had only the necessary permissions. This reduced internal and external threats and created a more transparent audit trail.
Combined Automation with Manual Audits: While automation remained an essential tool, we added manual audits during critical points in development and deployment. These manual checks helped us uncover misconfigurations and design weaknesses that automation had missed.
Implemented a Service Mesh: To tighten up security between services, we implemented a service mesh, which gave us much better control over how APIs interact and, crucially, helped us keep a closer eye on communication patterns. Even if one service was compromised, the service mesh prevented lateral movement, minimizing damage.
Adopted Chaos Engineering: We used chaos engineering principles to stress-test the architecture, simulating failures and attacks. This helped us identify and fix weak points before they could be exploited.

Key Takeaways for Cloud Native Teams

The lessons from this project apply broadly to any organization using cloud native architectures. Here’s what you can do to protect your infrastructure:

Regularly Audit APIs: Ensure all API interactions follow the least privilege principle. Broad permissions create serious vulnerabilities, especially in microservice environments.
Harden Access Control Policies: Review and tighten access controls frequently to reduce risks. Regular audits are essential for catching overly broad permissions.
Combine Automation with Manual Audits: Automation is vital for speed, but manual reviews can catch more profound architectural flaws. Schedule regular audits to uncover misconfigurations and design issues.
Leverage a Service Mesh for API Security: A service mesh allows for tighter control over service-to-service communication and better visibility into API interactions.
Embrace Chaos Engineering: Stress-test your architecture by simulating failures and attacks to find weaknesses before they become critical.

Conclusion: Speed Without Security Is a Recipe for Disaster

As companies increasingly embrace cloud native technologies, the rush to prioritize agility and scalability often leaves security as an afterthought. But that trade-off isn’t sustainable. By 2025, unmanaged APIs could expose organizations to significant breaches unless proper controls are implemented today.

Your choices will determine whether your systems can withstand tomorrow’s threats. Don’t let the drive for speed and innovation become a security disaster. Resilience and security are just as important as agility.

The post How to Ensure Cloud Native Architectures Are Resilient and Secure appeared first on The New Stack.

Ubuntu Linux: Install the Suricata Intrusion Detection System

Jack Wallen — Sat, 12 Oct 2024 16:00:12 +0000

An Intrusion Detection System (IDS) is essential for monitoring network traffic and checking for malicious activity. If your servers are of the Linux type, you have plenty of options, one of which is Suricata.

Suricata is a high-performance, open source network analysis and threat detection software that is used by numerous private and public organizations and includes features like alerts, automated protocol detention, Lua scripting, and industry-standard outputs. It offers six modes of operation:

Intrusion Detection System (the default)
Intrusion Prevention System
Network Security Monitoring System
Full Packet Capture
Condition PCAP capture
Firewall

Most users will go with the default mode, which is a combination of IDS and network security monitoring, which ensures alerts include information about protocol, flow, file transaction/extraction, anomaly, and flow logs. You can read more about Suricata from the official site.

Suricata is free to install and use.

What I want to do is walk you through the process of installing this IDS on Ubuntu Server 22.04.

What You’ll Need

To get Suricata up and running, you’ll need a running instance of Ubuntu Server 22.04 and a user with sudo privileges. That’s it… let’s get to work.

Install the Necessary Requirements

The first thing to be done is the installation of the necessary requirements. Log into your Ubuntu server and install those packages with the command:

sudo apt install autoconf automake build-essential cargo cbindgen libjansson-dev libpcap-dev libcap-ng-dev libmagic-dev liblz4-dev libpcre2-dev libtool libyaml-dev make pkg-config rustc zlib1g-dev -y

When the above command completes, you’re ready to move on.

Download and Unpack the Source

Next, we can download the Suricata source and unpack it. Download the compressed archive file with the command:

wget https://www.openinfosecfoundation.org/download/suricata-7.0.6.tar.gz

You might want to visit the Suricata download page to ensure you’re grabbing the most current version.

Unpack the file with the command:

tar xvzf suricata-7.0.6.tar.gz

The above command will create a new folder, called suricata-7.0.6.

Build and Install the Package

We can now build the package. Change into the newly-created directory with:

cd suricata-7.0.6

In that directory, run the configure script with:

./configure --enable-nfqueue --prefix=/usr --sysconfdir=/etc --localstatedir=/var

The above command will take a minute or so to complete.

Finally, install the package with the command:

sudo make && sudo make install-full

The installation will take between 5-10 minutes, depending on the speed of your hardware.

Another method of installing Surcicata is via a PPA repository. Add the repository with the command:

sudo add-apt-repository ppa:oisf/suricata-stable

Update apt with:

sudo apt-get update

Install Suricata with:

sudo apt-get install suricata -y

Do note: I prefer installing with the PPA method because it adds a systemd startup file for easy service control.

Start the Service

With the installation complete, it’s time to start the service with the command:

sudo systemctl enable --now suricata

Configure Suricata

It’s time to configure Suricata. Open the configuration file with:

sudo nano /etc/suricata/suricata.yaml

I’m going to assume you’ll be using Suricata on a LAN. For that, look for the line that starts with HOME_NET. In that line, you’ll need to configure your subnet (such as 192.168.1.0/16).

Next, look for the af-packet line. Below that you’ll see -interface: eth0. You need to change eth0 to the name of your networking interface (which can be found with the ip a command).

Once that’s taken care of, you’ll need to add the following to enable live rule reloading. The following can be added to the bottom of the configuration file:

detect-engine:
– rule-reload: true

Save and close the file.

Update the Suricata Rules

With the configuration taken care of, you’ll then want to update the Suricata rule sets with the command:

sudo suricata-update

Running Suricata

It’s time to take Suricata for a test run. After the rules have updated, we’re going to test the rules with the following command:

sudo suricata -T -c /etc/suricata/suricata.yaml -v

You shouldn’t receive any error message, and the test will end with the following:

Notice: suricata: Configuration provided was successfully loaded. Exiting.

Restart the service with:

sudo systemctl restart suricata

Test Suricata

Let’s run a quick test. Below is a command used to trigger a false alert. Do this:

Log into the server from a second terminal (or tab). From the first window, issue the command:

tail -f /var/log/suricata/fast.log

From the second terminal, issue the command:

curl http://testmynids.org/uid/index.html

In the first window, you should see output like this:

09/04/2024-17:44:43.767928 [**] [1:2100498:7] GPL ATTACK_RESPONSE id check returned root [**] [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 2600:9000:24d7:6c00:0018:30b3:e400:93a1:80 -> 2600:1700:6d90:f6b0:0000:0000:0000:001c:35524

Suricata caught the false alert.

Now that you have Suricata up and running (and successfully tested) check out the official documentation for Suricata rules that can help you get the most out of this free, open-source intrusion detection system. Suricata is a fairly complex system to use, so I would recommend you go through the official documentation to better understand how it works.

If you’d prefer to manage Suricata with a GUI, I’d recommend checking out IDS Tower.

The post Ubuntu Linux: Install the Suricata Intrusion Detection System appeared first on The New Stack.

Deno 2.0, Angular Updates, Anthropic for Devs, and More

Loraine Lawson — Sat, 12 Oct 2024 15:00:07 +0000

The Deno team introduced a new Deno for Enterprise program Wednesday, along with releasing Deno 2.0.

Deno for Enterprise includes priority support, direct access to their engineers, guaranteed response times, and priority for subscribers’ feature requests.

Deno 2.0 also released Wednesday, and the focus was on enabling Deno to be deployed at scale.

“This means seamless interoperability with legacy JavaScript infrastructure and support for a wider range of projects and development teams,” the team wrote in a blog post announcing the release. “All without sacrificing the simplicity, security, and ‘batteries included’ nature that Deno users love.”

Deno 2 supports Next.js, Astro, Remix, Angular, SvelteKit, QwikCity and other frameworks, the team added. It incorporates native support for package.json and node_modules, as well as a stabilized standard library and monorepo support, the team wrote. It also adds:

Package management with new deno install, deno add, and deno remove commands
Support for private npm registries
Workspaces support
JSR: a modern registry for sharing JavaScript libraries across runtimes

It is backward compatible with Node and npm.

On top of that, there’s a slew of improvements to existing features, including:

deno fmt can format HTML, CSS, and YAML
deno lint has Node-specific rules and quick fixes
deno test now supports running tests written using node:test
deno task can now run package.json scripts
deno doc’s HTML output has improved design and better search
deno serve can run HTTP servers across multiple cores, in parallel
deno jupyter now supports outputting images, graphs, and HTML

It is backward compatible with Node and npm.

The blog post also outlines what developers can expect from the Deno 2.1 release, although it doesn’t specify when that release will be available.

Angular To Modify effect() API

Angular identified improvements to its effect() API, thanks to developer feedback during its preview phase. Angular team members Alex Rickabaugh and Mark Thompson shared two planned changes:

Removing allSignalWrites in version 19. “To encourage good patterns, our initial design for the effect() API prohibited setting signals unless the allowSignalWrites flag was explicitly set,” they wrote. “Through developer feedback and observing real-world usage, we’ve concluded that the flag is not providing enough value. We’ve found that it wasn’t effective at encouraging good patterns, and ended up discouraging usage of effect() in cases where it would be reasonable to update signals.”
Significant changes to the timing of when effects run. Previously, effects would be queued and scheduled independently as microtasks, but now effects run as part of the component hierarchy during change detection, they wrote. There are a few use cases where this may impact your projects, they warn, such as effects against view query results and to Observable() of input signals, which now emit earlier than before, affecting the timing of Observable chains.

“When testing this change at Google, we fixed around 100 cases where the timing change meaningfully impacted code,” Rickabaugh and Thompson wrote. “Around half of these were test-only changes, and in a few cases the timing difference led to more correct application behavior.”

Anthropic Message Batches API

Anthropic released a public beta of a Message Batches API late last week that allows developers to send batches of up to 10,000 queries per batch for processing in less than 24 hours, the company said.

The API costs 50% less than standard API calls, the Anthropic team wrote in a blog post about the new Message Batches API.

“This makes processing non-time-sensitive tasks more efficient and cost-effective,” the team wrote.

The ability to process so many queries at once makes the API better at tasks that require analyzing large amounts of information, such as customer feedback or translating large documents.

Also, instead of building systems to manage many requests, developers can send a batch of queries to Claude for handling, a company spokesman said. He added that the batch processing plus the lower costs makes it possible to do things that previously were too expensive, such as analyzing an entire document archive.

The API currently supports Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku on the Anthropic API. Support for batch processing for Claude on Google Cloud’s Vertex AI will soon be available as well. Customers using Claude in Amazon Bedrock can use batch inference, the team added.

Price does vary slightly by model used, with Claude 3 Opus being the most expensive, with Batch Input costing $7.50/MTok and batch output costing $37.50/MTok, according to the blog post and Claude 3 Haiku being the least expensive at $0.125/MTok for batch input and $0.625/MTok for batch output.

Sonar’s New Tools Clean Up AI-Generated Code

Sonar, which offers products to clean code, announced two new AI-driven solutions last week: Sonar AI Code Assurance and Sonar CodeFix.

Sonar AI Code Assurance is designed to improve the quality of code produced by generative AI. It analyzes the codebase for potential issues to ensure the code meets standards of quality and security.

Sonar offered the new tool as a way to address a pain point it saw in AI-driven development.

“AI is transforming the way developers work, streamlining processes, and reducing the toil associated with writing code,” Sonar CEO Tariq Shaukat said in a prepared statement. “As the adoption of AI coding assistants grows, however, we are seeing a new issue emerge: code accountability. AI-generated code needs review by developers, but accountability for doing this is increasingly diluted. As a result, we’re seeing the review step frequently being shortchanged.”

Sonar AI CodeFix enhances Sonar’s offering with AI to deliver a better developer experience. It allows developers to resolve issues detected by Sonar’s code analysis engine with a single click, directly within their workflow, the company stated.

The features are currently available for both SonarQube and SonarCloud.

The post Deno 2.0, Angular Updates, Anthropic for Devs, and More appeared first on The New Stack.

Linux: Create System Backups With rsnapshot

Jack Wallen — Sat, 12 Oct 2024 14:00:38 +0000

One step to data reliability is backing up your data on a regular basis. You never know when something could go wrong with a server or desktop, leading to a loss of critical files or configurations. To avoid such a nightmare, you might want to consider using a tool that handles incremental backups of local and remote file systems.

One such tool is rsnapshot, which benefits from using hard links, so disk space is used only when necessary. Rsnapshot works as a wrapper for the widely used rsync tool and is fairly easy to install and configure.

I’m going to walk you through the process of installing and configuring rsnapshot on Ubuntu Server 22.04, but you can make use of this application on most Debian-based distributions as well as those based on Fedora.

What You’ll Need

The only things you’ll need are a running instance of Ubuntu Server and a user with sudo privileges. Because rsnapshot can also back up to an external drive, you might also consider connecting such a drive for even better backup reliability. After all, should your OS go down and render the machine unbootable, if your backups are stored on the drive housing the OS, you could lose those backups as well.

For example, you might connect and external drive and mount it to a new directory named /backup, which is what I will demonstrate here. To make that happen, you might also want to configure that drive to automount at boot, which would require a line similar to this in the /etc/fstab file:

/dev/disk/by-uuid/13557fad-d203-4448-991b-c8011907dc1d /backup auto rw,nosuid,nodev,nofail,x-gvfs-show 0 0

Make sure you use your particular drive UUID and any options you prefer for the automounting of drives.

With that said, let’s get to the installation.

Installing rsnapshot

The rsnapshot package can be installed from the standard repositories with the command:

sudo apt-get install rsnapshot -y

If you’re using a Fedora-based distribution, the installation command is:

sudo dnf install rsnapshot -y

If your distribution of choice is Arch Linux, the command is:

sudo pacman -S rsnapshot

This should install all dependencies. If you find rsync doesn’t install, do so with:

Ubuntu: sudo apt-get install rsync -y
Fedora: sudo dnf install rsync -y
Arch: sudo pacman -S rsync

Configuring rsnapshot

Now that rsnapshot is installed, it’s time to configure it. One thing to keep in mind (and this is very important) is that you can’t use spaces in the configuration file; if you do, it will result in syntax errors. Instead, if necessary, use tabs.

Open the configuration file with the command:

sudo nano /etc/rsnapshot.conf

The first line you want to look for is this one:

snapshot_root  /var/cache/rsnapshot/

On a Fedora-based distribution, that line might read:

snapshot_root /snapshots/

The above line configures the directory that will house the backups. For example, if you’re going with my suggestion of an external drive mounted to /backup, the line would be:

snapshot_root /backup

You will also want to disable the creation of the root directory; otherwise, you’ll wind up with a child directory with /backup. To disable this feature, look for the following line:

#no_create_root 1

Uncomment the line by removing the # character so the result looks like this:

no_create_root 1

You’ll need to know the path to the rsync executable, which can be found with the command:

which rsync

The results should be /usr/bin/rsync. If it’s anything else, take note, because you have to configure that path in the following line:

cmd_rsync /usr/bin/rsync

Next, we need to set a retention policy. This is handed in the BACKUP LEVELS / INTERVALS section of the configuration file, where you’ll see the following default options:

retain  alpha   6
retain  beta    7
retain  gamma   4

The names above are arbitrary and the number is how many backups of that type will be retained. You can change the names if you’d like to, but remember that they should be in ascending order, and that the names you choose will be used to run the specific backups. You could change those names from alpha, beta and gamma to daily, weekly and monthly, which would make a lot more sense.

The next section to configure is what you want to back up. This section is listed under #LOCALHOST (near the bottom of the configuration file), where you’ll find the following:

backup  /home/          localhost/
backup  /etc/           localhost/
backup  /usr/local/     localhost/

You can change the directories to be backed up to whatever you need, but leave localhost/ as is; that instructs rsnapshot that we’re backing up to the local machine.

It’s also possible to exclude and include files from the backups. This is handled above the LOCALHOST section, where you’ll see the following:

#include        ???
#include        ???
#exclude        ???
#exclude        ???

For example, you might have specific files you don’t want to include in the backup. For that, make sure to create an exclude line with the direct path to the file in question.

Once you’ve taken care of the above, save and close the file with the Ctrl+X keyboard shortcut.

Testing the Configuration

Before you launch the backup, you should test the syntax of your configuration file with the command:

sudo rsnapshot configtest

If the command returns Syntax OK, you’re good to go.

Let’s run a test on the daily backup (which I used in place of alpha in the config file). To run the test, issue the command:

sudo rsnapshot -t daily

You’ll see output that looks something like this:

echo 28061 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /backup/daily.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    /home/ /backup/daily.0/localhost/
mkdir -m 0755 -p /backup/daily.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded /etc/ \
    /backup/daily.0/localhost/
mkdir -m 0755 -p /backup/daily.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    /usr/local/ /backup/daily.0/localhost/
touch /backup/daily.0/

To run the first backup, issue the command:

sudo rsnapshot daily

When the backup completes, you’ll find a subdirectory named daily.0 in /backup that houses the snapshot.

Scheduling Backups

Rsnapshot doesn’t include a built-in scheduler, so you’ll have to make use of cron. What we’ll do is create three entries — one each for daily, weekly and monthly. Issue the command:

sudo crontab -e

At the bottom of the file, add the following lines:

0 1  * * *           root    /usr/bin/rsnapshot daily
0 5  * * 6           root    /usr/bin/rsnapshot weekly
0 2  1 * *           root    /usr/bin/rsnapshot monthly

The above lines do the following:

Takes a daily snapshot at 1 a.m.
Takes a weekly snapshot every Saturday.
Takes a monthly snapshot on the first of each month at 2 a.m.

And that’s it. You now have a backup system that will automatically take snapshots of the configured directories and save them to your chosen destination.

The post Linux: Create System Backups With rsnapshot appeared first on The New Stack.

Working With JSON Data in Python

Jessica Wachtel — Sat, 12 Oct 2024 11:00:53 +0000

JavaScript Object Notation (JSON) facilitates information sharing between applications. It’s the go-to choice for exchanging data between web clients and servers and for communication between APIs and various services or data sources. JSON’s format is both machine-readable and human-friendly, making it easy to parse, generate, and understand. Its efficiency in reducing data transmission makes JSON the go-to format for information sharing

JSON is:

Lightweight: JSON is minimalistic and efficient, making it suitable for transmitting data over networks.
Text-based: JSON’s plain-text format is composed of readable characters and symbols. Though JSON is based on a subset of JavaScript, it’s language-independent and compatible with most modern programming languages.
Structured data: JSON represents data as key-value pairs, which can be organized in nested structures like objects (dictionaries) and lists.

Working with JSON data is a common occurrence when building and maintaining applications.

JSON in the Real World

JSON performs the following tasks quickly and efficiently.

API interaction: Python applications often send and receive responses in JSON.
Examples: weather data and stock data sharing
Data Storage: JSON is used to store and share data within different parts of a system and across different systems.
Examples: configuration files and logs
Serialization and Deserialization: JSON converts data structures into a string format (serialization) and then reconstructs them into their original form (deserialization). This process simplifies data sharing and interchange.
Examples: storing user preferences and the transmission of complex data objects

Get Started: Working With JSON in Python Applications

Importing the json module is the first step.

View the code on Gist.

Parsing and Accessing JSON Data

Parsing data

Parsing JSON data converts JSON-encoded data (usually in string form) into a format that can be accessed within a programming environment. In Python applications, the JSON string is converted into native Python data structures like dictionaries and lists.

json.loads() is the json module’s data parsing function.

View the code on Gist.

Output: {‘apple’: ‘red’, ‘banana’: ‘yellow’, ‘cucumber’: ‘green’, ‘types’: [‘fruit’, ‘vegetable’]}

The json.loads() function is the same for both nested and unnested data.

Accessing Unnested Data

You can access the JSON data after parsing.

Hardcoded:

View the code on Gist.

Output:
red
yellow
[‘fruit’, ‘vegetable’]

Dynamic:

Accessing nested data

Often, JSON data is nested. This means the data includes dictionaries or lists within other dictionaries or lists.

Nested data is incredibly common because it mirrors how data is organized in the real world. An example of nested data is a user profile on an e-commerce site that contains personal details, addresses, and a list of purchases, all with their own attributes.

Accessing nested JSON data requires a different code setup than unnested data.

Hardcoded:

View the code on Gist.

Output:
Sarah Ellington
B+

Dynamic:

Working with nested data dynamically requires a recursive or iterative function. Though both work, recursive solutions are more elegant and slightly easier to read.

View the code on Gist.

Output:

Reading and Writing JSON Data to a File

Writing

json.dumps() saves JSON data in a file. json.dumps() converts Python objects into a JSON-formatted string, known as serializing the data. The serialized data is then written to a file or transmitted over a network. Writing nested and unnested data follows similar protocols.

Unnested:

View the code on Gist.

Nested:

View the code on Gist.

open('data.json', 'w') opens data.json in write mode. Write mode overwrites the file if it exists or creates a new file if it doesn’t.

View the code on Gist.

Reading

Reading data differs between nested and unnested data. Since nested data is so common when in doubt, write a function that reads nested data. The json module’s data reading function is json.load(file).

View the code on Gist.

Here’s the unnested version just for reference and comparison.

View the code on Gist.

Pretty Printing

Pretty printing JSON data improves readability and simplifies debugging by formatting the data with indentation and line breaks. Pretty printing provides a clear, accessible format for shared data. This helps with documentation and collaboration.

The indent parameter in the json.dumps() function formats JSON data with the indentation. Pretty printing follows the same process for both nested and unnested data.

The number following indent in the json.dumps() function dictates how many spaces to indent. Four is fairly common though indent best practices can vary by system.

View the code on Gist.

Output:

Error Handling

JSON presents different exceptions depending on how you interact with the data, but in all cases, using try…except blocks is the standard approach for handling errors when working with JSON.

View the code on Gist.

Output:
Failed to decode JSON: Expecting property name enclosed in double quotes: line 1 column 29 (char 28)

For more on how to handle errors, check out our error-handling tutorial.

Retrieving JSON Data from APIs

JSON is a common format for data transmitted through APIs, and the requests library is frequently used to handle such data in Python.

Here’s an example of how to retrieve JSON data from an API with error handling:

View the code on Gist.

Conclusion

JavaScript Object Notation (JSON) is a fundamental tool for data interchange between applications. Its efficiency in reducing data transmission makes it the preferred choice for web clients, servers and API interactions. Understanding JSON’s capabilities and best practices, including proper error handling and pretty printing, are the first steps toward working efficiently in real-world scenarios.

The post Working With JSON Data in Python appeared first on The New Stack.

Configure Microservices in NestJS: A Beginner’s Guide

Zziwa Raymond Ian — Fri, 11 Oct 2024 19:00:00 +0000

Monolithic architecture was the predominant approach to backend development before 2011. In this model, the entire application is structured as a single, unified codebase, where all components and services are tightly coupled and deployed together as one module. The monolithic approach encapsulates all business logic, data access, the user interface (UI) and other functionalities within a single executable or application.

While the monolithic approach offers simplicity in development and deployment, it introduces significant challenges as applications scale. With a single codebase, even minor changes necessitate rebuilding and redeploying the entire application, resulting in longer development cycles and higher risk of introducing errors. Moreover, scaling a monolithic application is often inefficient, as it typically requires scaling the entire system, even if demand increases in only one component. The tight coupling of components also leads to interdependencies, making the system more fragile and harder to maintain as teams and codebases grow.

The most critical drawback of monolithic architecture, however, is its extensive failure impact, often described as the “burst radius.” A failure in one component can cause the entire system to go down, leading to significant downtime. This interconnection heightens the risk of widespread outages and complicates troubleshooting and recovery. A single issue can cascade through the entire system, making it difficult to isolate and resolve without affecting other parts. Consequently, organizations often face prolonged downtimes, which can have a severe impact on business operations and the overall user experience.

Despite these drawbacks, the monolithic method was the standard for many years due to its simplicity and the lack of alternatives. However, microservices and other new architectural paradigms have provided more flexible and scalable solutions.

What Are Microservices?

In a microservices architecture, an application is composed of small, independent services that communicate with each other through well-defined APIs. Microservices divide an application into distinct, loosely coupled services. Each service is responsible for a specific piece of functionality; for example, in an e-commerce backend application, user authentication, payment processing, inventory management and other services, can be developed, deployed and scaled independently. This provides numerous advantages, including:

Scalability: Microservices enable the independent scaling of individual services. When one service experiences high demand, it can be scaled without impacting the entire application. This optimizes resource utilization and enhances overall performance.
Technology flexibility: In a microservices architecture, each service can be developed using the most suitable technologies, languages or frameworks for its specific needs. This flexibility allows development teams to select the best tools for each task.
Fault isolation: Due to the independent nature of microservices, a failure in one service is less likely to affect the entire system. This minimizes the impact of failures, enhances system resilience and reduces downtime.
Development and deployment speed: Microservices allow teams to work on different services concurrently, accelerating development processes. Continuous integration and deployment practices are easier to implement, enabling faster updates and more frequent improvements.
Simplified maintenance and updates: The modular structure of microservices makes maintaining and updating applications more straightforward. Changes can be made to individual services without affecting others, reducing the risk of errors and simplifying the testing process.
Organizational alignment: Microservices facilitate organizing teams around specific business capabilities. Each team can take full ownership of a service, from development through deployment and support, leading to increased autonomy, accountability and efficiency.
Enhanced agility: The modular design of microservices supports iterative development, allowing for more responsive adaptation to changing business needs and fostering rapid innovation.

Monoliths vs. Microservices: Structural Differences

In a monolithic application, all client requests are handled by a single, general controller. This controller is responsible for processing the requests, executing the necessary commands or operations and returning the responses to the client. Essentially, all business logic and request handling is centralized, which simplifies the development process.

In contrast, a microservices architecture introduces additional complexity with the inclusion of an application gateway. The application gateway serves as a crucial intermediary layer in a microservices setup. Here’s how it works:

Request handling: The gateway receives all incoming requests from clients.
Routing: It then determines the appropriate microservice or controller that should handle each request based on its routing rules.
Service interaction: The selected controller interacts with the corresponding microservice to process the request.
Response aggregation: Once the microservice has completed its task, it sends the result back to the controller, which then forwards it to the gateway.
Client response: Finally, the gateway returns the processed response to the client.

This layered approach separates the concerns of request routing and business logic, allowing each microservice to focus on its specific functionality while the gateway manages request distribution and response aggregation. If this sounds complex, don’t worry — I will walk through each component in detail and explain how it all works together.

Implementing Microservices With NestJS

NestJS is a progressive Node.js framework that leverages TypeScript, offering a powerful combination of modern JavaScript features, object-oriented programming and functional programming paradigms. It is designed to provide a native application architecture that helps developers build highly testable, scalable and maintainable applications.

In this tutorial, I will show you how to implement microservices using NestJS as the primary technology, NATS as the communication medium, Prisma as the object-relational mapping (ORM) technology, MySQL as the database and finally Postman to test endpoints.

This approach will demonstrate how to effectively manage microservices, ensuring they communicate seamlessly, are easily scalable and can be deployed reliably in a production environment. Along the way, I’ll cover best practices for setting up a microservices architecture, managing dependencies and securing a deployment, creating a solid foundation for building robust and efficient distributed systems.

Set Up the Base NestJS Application

Before you begin, ensure Node.js is Installed. Node.js is essential for running JavaScript code server-side and managing packages. If you haven’t installed Node.js yet, you can download it from the official Node.js website. Next, use npm (which comes bundled with Node.js) to install the Nest command-line interface (CLI), a tool that simplifies the creation and management of NestJS applications.

With the Nest CLI installed, set up your base NestJS application to serve as your gateway and name it api-gateway:

Launch your preferred text editor (e.g., VS Code, Sublime Text) and open the parent directory of the NestJS application (the directory that contains the parent folder of your base application). Navigate into the base application folder (i.e., api-gateway) and open a new terminal instance. Most modern text editors have built-in terminal capabilities. For VS Code, you can open the terminal by selecting Terminal from the top menu and then New Terminal. For Sublime Text, you might need to use a plug-in like Terminus to open a terminal within the editor.

Build the Backend Application

With your base application successfully up and running, the next step is to build a foundational backend application for a blog site. You’ll implement two separate services in this tutorial: one for managing readers, and another for handling create, read, update and delete (CRUD) operations for your blog articles. If you’ve worked with NestJS before, the project structure will be familiar and straightforward. However, I’ll provide a brief overview of the structure in case you’re not sure how it’s organized.

When you scaffold a new NestJS project, the default structure typically includes:

1. src: This is the main directory where most of your application code resides.

app.module.ts: The root module that ties together the different parts of the application.
app.controller.ts: The controller responsible for handling incoming requests and returning responses.
app.service.ts: The service that contains the business logic; can be injected into controllers.
main.ts: The entry point of the application, where the NestJS app is bootstrapped.

2. test: This directory contains the test files for your application.

app.e2e-spec.ts: The end-to-end test file.
jest-e2e.json: The configuration file for end-to-end testing with Jest.

3. node_modules: This directory holds all the installed dependencies for the project.

4. package.json: This file lists the dependencies and scripts for your project.

5. tsconfig.json: The TypeScript configuration file.

6. nest-cli.json: The configuration file for the NestJS CLI.

Create the Microservices and Gateway

The next step is to create two additional applications that will serve as microservices and name them reader-mgt and article-mgt. These applications will function as independent microservices within the architecture. After that, install the @nestjs/microservices and nats libraries to enable communication between services. Then configure these two applications to listen for requests via NATS, ensuring they can handle the incoming messages accordingly.

So, in the same folder, run the following commands:

With the two services now scaffolded, configure your gateway to handle client requests and route them to the appropriate services. First, install the @nestjs/microservices and nats dependencies. Then create a NATS module, which will be registered in the API gateway‘s app module to enable proper communication between the gateway and the microservices:

If you’re not already in the gateway folder, use the cd command to navigate into it. Once there, go to the src folder and create a new directory named nats-client, which will serve as the location for your NATS client configuration. After that, create a file named nats.module.ts within the nats-client folder, and add the following code:

This code creates a NatsClientModule, which you’ll later register in the API gateway’s app module. First, it imports the Module decorator along with the ClientsModule and Transport declarations from the @nestjs/microservices library that you installed earlier. Next, it registers the NATS_SERVICE and specifies the transport as Transport.NATS. NestJS supports various transport clients by default, but for this example, stick with NATS.

Then it defines an options object, which specifies the servers property and sets the NATS server address to nats://localhost:4222. Finally, it exports the registered NATS clients to make them accessible to other modules, which is useful if you have multiple modules within the gateway. For instance, you might have a users module, articles module, readers module and more. However, this tutorial uses a single controller and module for readers and articles.

With this complete, you can now proceed to the app.module.ts file and register the NatsClientModule:

At this point, you’re about 80% done with configuring the API gateway. The last step is to define the API routes in the app.controller.ts file. Navigate to that file and add the following code:

This defines the API routes that will handle incoming HTTP requests and forward them to the NATS service for processing. The AppController class uses the @Controller decorator to specify the base route for all endpoints, which is 'api/'. Within this controller, inject the NATS client using the @Inject decorator, associating it with the 'NATS_SERVICE' token. The controller includes several endpoints: POST /save-reader and GET /get-all-readers for managing readers, and POST /save-article, GET /get-all-articles and POST /delete-article for managing articles. Each endpoint method uses the natsClient.send method to send commands to the NATS service, passing the request body as the payload. This setup allows the API gateway to relay client requests to the appropriate microservices via NATS.

Finally, execute the npm run start:dev command to start the API gateway application. This will verify that the application runs smoothly and without any errors.

Figure 1: The api-gateway application

Configure Communication Services

Next, configure your services to handle requests from the running API gateway, process them and send responses back. However, before proceeding, there’s an important step: setting up the NATS server locally. Since you specified the NATS server address as nats://localhost:4222, both the gateway and services will expect a NATS server running on your local machine.

For development purposes, install the NATS server locally. Although you can run this in a Docker container, work with a local setup for simplicity. For Linux and macOS users, install the NATS server using brew install nats-server and run nats-server to start the service. For Windows users, use choco install nats-server. If choco is not recognized, ensure Chocolatey is installed by running:

Then verify the installation by running choco --version. If you need further guidance, consult the NATS documentation.

Figure 2: NATS server running locally

Configure Your First Service

Now you can configure your first service, article-mgt. Go to the main.ts file, which is the entry point of this service, and replace the default code with:

This code transforms article-mgt from a standalone application into a NestJS microservice instance and configures it to use NATS as the transport mechanism, specifying the server address (nats://localhost:4222) to connect to the NATS server.

Next, create a new directory named dto within the src folder, and then create a file named dto.ts, which will house the expected payload structure. DTO stands for data transfer objects, which are simple objects used to transfer data between different layers of an application, especially during network requests. In this context, DTOs help define the structure and type of the payload that the backend application expects from client requests. You can implement further validation using the class-validator dependency, if needed. However, to keep this article focused, we won’t use it here. You can learn more about class-validator in the NestJS official documentation, if you’re interested.

This code defines two DTOs for handling data in the article-mgt service. The saveArticleDto class specifies the structure for saving an article, requiring a title and content. And the deleteArticleDto class defines the structure for deleting an article, which requires an id to identify the article to be removed. These DTOs help ensure that the data passed between different parts of the application is well-defined, consistent and matches the expected type. There are three routes for articles but only two DTO classes are defined. This is because the third route, which retrieves all articles, does not require any payload.

Now go to the app.controller.ts file and modify the code.

Figure 3: Code in app.controller.ts

You might notice the red squiggly lines under the function names in the controller methods; this is because you haven’t yet defined these functions in app.service.ts. Before addressing that, let me explain the code: It imports the DTOs to enforce type checking on payloads, ensuring the data passed to the functions meets the expected structure. The @MessagePattern decorator specifies how messages should be handled. It takes an object where the cmd property defines a command string. This string must match the command specified earlier in the API gateway. The API gateway uses this command to determine which function to call for a given API request, attaching the command to the request before forwarding it.

Use Prisma To Interact With Your Database

To interact with your database using Prisma, create a Prisma module and service that you can use in the app.service.ts file. Start by creating a folder named prisma inside the src directory. Then, within this folder, create two files: prisma.module.ts and prisma.service.ts. The PrismaService in prisma.service.ts extends the PrismaClient class from Prisma and customizes the Prisma client by configuring the database connection URL using the DATABASE_URL environment variable. The PrismaModule in prisma.module.ts defines a module that provides the PrismaService, allowing it to be injected and used in other parts of the microservice for database operations.

Now, move to the app.service.ts and add the code below to define the necessary functions. To explain briefly: The saveArticle function takes data as a parameter, which must be of the saveArticleDto type, as defined earlier. The function uses a try-catch block to handle the process. First, it attempts to insert the data into the database. After that, it calls the getAllArticles function to retrieve the updated list of articles. Since getAllArticles is an asynchronous function, it uses the await keyword. Once this is done, the function returns an object as a response, containing the HttpCode, which includes the statusCode, a message and a string. Additionally, the getAllArticles function returns all articles from the database, and the deleteArticle function handles the deletion of an article based on the provided ID.

Start Your First Service

With that accomplished, you can now start your articles-mgt service and check if it runs smoothly without any errors. To do this, simply execute the command 'npm run start:dev'.

Figure 4: article-mgt service

Configure Your Second Service

Now that you’ve completed the article-mgt microservice configuration, move on to configuring the reader-mgt service. This service will handle two primary operations: registering readers and retrieving all registered readers. Since the setup process is quite similar to what I’ve already covered, I’ll skip the detailed explanations to save time. The implementation is essentially the same, just within a different service context.

To set up the reader-mgt service, start by navigating to the reader-mgt directory. Since the main.ts file in this service will have the same code as the article-mgt service, you can simply copy the content from the article-mgt main.ts file and paste it into the corresponding file in reader-mgt. Next, copy the entire prisma directory from the article-mgt service into the reader-mgt service. However, this won’t work immediately; you’ll need to install and initialize Prisma in the reader-mgt service. Run npm install Prisma @prisma/client to install Prisma, and then execute npx prisma generate to initialize it. Additionally, define the schema for the readers and perform a migration. Don’t forget to copy the database connection string from the .env file in article-mgt because without it, the reader-mgt microservice won’t be able to connect to the database.

Figure 5: Reader and article models

After defining the reader schema, run npx prisma migrate dev to apply the migration to the database, which will add the reader table to the MySQL database.

The final step involves configuring the app.controller.ts and app.service.ts files in the reader-mgt service. This process is similar to what you did in the article-mgt service. In the controller, define the routes, and then map these routes to corresponding functions in the service. You can use the article-mgt microservice configuration as a reference to guide you through this process.

Run Your Microservice

After configuring the app.service.ts, app.module.ts and app.controller.ts files for the reader-mgt service, the final step is to run the microservice to ensure that everything is functioning correctly and that there are no errors. This involves verifying that the routes in the controller correctly map to the functions in the service and that the microservice can handle requests as expected.

Once you’ve confirmed that all configurations are in place, you can start the reader-mgt service using the npm run start:dev command. This will launch the service in development mode, allowing you to check for any issues and ensure that the service is working seamlessly.

Figure 6: reader-mgt microservice

Test Your Application

If you’ve made it this far, congratulations! The coding portion of your project is complete, and your api-gateway, reader-mgt and article-mgt services are up and running without any errors. The next step is to use Postman to test the application and ensure that it performs as expected. Use Postman to send requests to the API gateway and verify that the operations are being correctly handled by the microservices. This will help confirm that all parts of the application are working together seamlessly.

Figure 7: /save-reader endpoint

Figure 8: /get-all-readers

The image illustrates the flow of the save-reader and get-all-readers endpoints as they pass through the API gateway and reach the reader-mgt microservice. The API gateway first receives the requests, identifies the correct commands and forwards them to the reader-mgt service via NATS. The reader-mgt service then processes the requests by creating a new reader or retrieving all readers. Once the requests are successful, the service returns the appropriate response.

Next, test the article-mgt endpoints by sending requests to create, delete and retrieve articles. First, send three create requests to the /save-article endpoint to add three articles to the database, as shown in Figure 9. Then send a request to the /delete-article endpoint to delete the article with an ID of 2. Finally, make a GET request to the /get-all-articles endpoint to retrieve the updated list of articles, confirming that the deletion was successful and the remaining articles are correctly listed in the database.

Figure 9: /save-article requests

Figure 10: /delete-article request

Figure 11: /get-all-articles request

Conclusion

Congratulations on reaching the end of this comprehensive setup guide! You’ve navigated through the intricacies of configuring a robust microservices architecture using NestJS, Prisma, MySQL and NATS. While you’ve successfully set up a functional microservices architecture, there’s always room for further enhancements.

As you continue to develop your application, consider implementing additional features such as robust error handling, security measures and comprehensive logging. Exploring Docker for containerization and Kubernetes for orchestration could further streamline your development and deployment processes.

Thank you for following along with this guide. Your dedication to mastering these technologies will undoubtedly pave the way for creating sophisticated and resilient applications. If you need the code for this blog, find it in my GitHub repository. Happy coding, and best of luck with your continued development!

The post Configure Microservices in NestJS: A Beginner’s Guide appeared first on The New Stack.

How Microsoft Edge Is Replacing React With Web Components

Richard MacManus — Fri, 11 Oct 2024 18:00:49 +0000

When Microsoft’s Edge browser team released WebUI 2.0 in May, a project that aimed to replace React components with native web components, its primary goal was to make Edge faster for end users. The core idea was that adopting a “markup-first architecture” would reduce JavaScript reliance on its product, which would mean less code to process on the client side — hence a better experience for the user.

To find out how the WebUI 2.0 project is going — including what inspired it and its ultimate goals — I spoke to Andrew Ritz, who leads the Edge Fundamentals team at Microsoft.

But first, let’s quickly clarify what web components are. The community site WebComponents.org describes them as “a set of web platform APIs that allow you to create new custom, reusable, encapsulated HTML tags to use in web pages and web apps.” Ritz puts it this way, when advising his own team how to approach this web development paradigm: “Anytime you want to do a new control [and] you find yourself writing any JavaScript, pause — stop — talk to a senior engineer and ask, how do you solve this with HTML and CSS?”

Why Did Microsoft Edge Decide to Ditch React?

Ritz says that his team’s aim is to convert around 50% of the existing React-based web UIs in Edge to web components by the end of this year.

But what was the impetus for the project — why did they decide they needed to move away from React in their web interfaces? Ritz explained that it began from looking at the work requests his “web desk” team at Edge was getting — “both external, to help improve the Chromium project, as well as internal requests.”

“…we [Microsoft] had adopted this React framework, and we had used React in probably one of the worst ways possible.”
– Andrew Ritz, Partner GM, Edge Fundamentals at Microsoft

An example of the latter was the Excel web app, which uses the Canvas element. So one of the questions they had to consider was, “How can we make Canvas more performant?” The HTML

To help the web desk team deal with requests like this, Ritz wanted to adopt a more “opinionated approach” that would also address issues such as slowness in web apps.

“And so what we did is we started looking at, internally, all of the places where we’re using web technology — so all of our internal web UIs — and realized that they were just really unacceptably slow.”

Why were they slow? The answer: React.

“We realized that our performance, especially on low-end machines, was really terrible — and that was because we had adopted this React framework, and we had used React in probably one of the worst ways possible.”

The use of React within Microsoft kept getting compounded over time, as more teams used it for their UIs. So the company ended up with “one just gigantic bundle that everybody was depending upon,” said Ritz. It was a mess of bundle dependencies across web apps.

“It was just this terrible experience, especially on the lower cost, lower-end machines,” said Ritz. “We were seeing multi-second startup times for something that is ostensibly local. It was just, you know, shocking.”

Edge Web UIs

Within Edge itself, there are between 50 and 100 web UIs, said Ritz, adding that “each of those are like their own little web application.” Around two-thirds of those Edge web UIs were built in React, before the Web UI 2.0 project started. Interestingly, the Edge team had originally used React in order to differentiate itself from Chrome.

“The team, as they were doing the port to Chromium, decided, well, we needed to add some kind of UI differentiation — different from what Chrome had — and so in the process of that, they did this kind of heavy conversion to React.”

So the current Web UI 2.0 project is, in a sense, rewinding much of the original development work done on Edge.

Ritz’s engineering team owned one of those React Web UIs: “browser extensions.” When you’re using Edge, it’s activated by clicking a heart icon in the browser bar, which opens up a sidebar. This then became the testbed to see what performance improvements could be made using web components for that UI, to replace the React components.

Edge browser essentials (on the right)

Are Web Components Too Hard?

Recently yet another debate erupted on social media about web components versus framework components. Ryan Carniato, creator of the SolidJS JavaScript framework, wrote a blog post with the provocative title, “Web Components Are Not the Future.” Essentially his argument is that a framework like SolidJS is able to do more than web components in certain scenarios, and is easier to implement. He dismisses web components as “a compromise through and through.”

In reply to Carniato, Shoelace creator Cory LaViska argued that web components offer stability and interoperability.

“The people actually shipping software are tired of framework churn,” wrote LaViska. “They’re tired of shit they wrote last month being outdated already. They want stability. They want to know that the stuff they build today will work tomorrow.”

LaViska also pointed out that web components don’t do all the things framework components do “because they’re a lower-level implementation of an interoperable element.”

It’s the kind of developer debate that rages endlessly on social media — it’s disappeared from the daily feed now, but you can bet it’ll be back in a month or two. In any case, I asked Andrew Ritz how his engineering team has adapted to web components and whether they’re as difficult to deploy as some critics claim.

“Our approach has been really to say, let’s use as many of the built-in constructs as possible,” he replied. “So as many of the built-in elements that exist within the browser, and it’s not been so bad to do this.”

“…effort to make web components perform well has definitely been an issue.”
– Ritz

Ritz noted that Edge developers have the advantage of using Microsoft’s own Fluent UI framework, which includes both React components and web components (among other types of components — such as mobile-centric ones for iOS and Android). But even using a company-wide framework to implement web components hasn’t been easy, he admits.

“There [have] been cases where [a] built-in control needs a lot of work — you know, it’s pretty heavy with polyfills, or things like that — that we’re just never, ever going to need. So effort to make them perform well has definitely been an issue.”

In terms of what Ritz calls “development agility” around web components (others might call it “developer experience“), he says that “we’ve actually seen some pretty good improvements.” For instance, being able to focus on HTML and CSS has meant that the developers and designers are aligned better — because they’re talking the same language.

“By us [the developers] focusing on using HTML and CSS, we kind of remove this entire translation layer, where somebody [in the dev team] might have to take, like, a wireframe and convert it to some other thing. […] And so that [was] a huge impediment to developer productivity for us, and we eliminate that entire loop.”

On Widespread Adoption of Web Components

It’s fair to say that it’s easier for Microsoft’s browser team to implement web components than the average web development team. Apart from having Microsoft’s Fluent UI framework to call on, the Edge team is also building a software product that only needs to cater to one browser: its own. Whereas almost every other web dev team has to make sure their product is usable on a variety of different browsers: from Chrome, to Edge, to Safari, to Firefox, and others.

“We have an easier time, perhaps, because we can say we only depend upon Edge for our local things,” is how Ritz puts it. “That can be like this true expression of [the] modern, latest web. Whereas a website owner — gosh, they might be forced to support Safari, or something, that doesn’t support half of the constructs that we’d like…and that brings complexity.”

“I’d be good proof if we could get some of the bigger non-web component websites within Microsoft to move over.”
– Ritz

That said, Microsoft’s intention is to release some of its WebUI 2.0 packages as open source — as well as a set of “web platform patterns.” However, Ritz notes that many external developers might not want to do things exactly the same way — for example many developers would want to choose a different styling framework than Fluent UI. But at the very least, Ritz’s team will be able to provide “learning patterns” for others.

An intermediary step will likely be trying to convince other Microsoft web products to make the move to web components.

“I don’t know what the rest of Microsoft exactly will do,” said Ritz. “We [the Edge team] kind of want to get our house clean with […] a common library and whatnot. But I think I’d be good proof if we could get some of the bigger non-web component websites within Microsoft to move over.”

But he added that they’re open to external partners, to help lead the way to a post-React world.

“If we could find someone external that was meaningful, that wanted to partner on this — by all means, we would be delighted.”

The post How Microsoft Edge Is Replacing React With Web Components appeared first on The New Stack.

Deploy Kubernetes Behind Firewalls Using These Techniques

Mohan Sitaram — Fri, 11 Oct 2024 17:00:06 +0000

As Kubernetes and cloud native systems become the de facto standard for deploying and managing modern applications, their expansion into restricted or firewalled environments brings unique challenges. These environments are often driven by regulatory compliance, security concerns, or organizational policies, which present architectural, operational, and security-related hurdles. This article delves into the intricacies of deploying Kubernetes clusters behind firewalls, offering solutions and strategies to overcome these obstacles.

A firewalled or restricted environment limits external internet access to ensure data security and protect systems from unauthorized intrusions. These environments are typical in industries with stringent regulatory requirements, such as finance, healthcare, and government. In such environments, only specific types of traffic are permitted, often with strict oversight. While these controls enhance security, they create significant challenges for modern cloud native infrastructures like Kubernetes, which rely on internet access for features such as cluster management, image pulling, and external API communications.

Challenges of Deploying Kubernetes in Firewalled Environments

Image Management and Distribution: Kubernetes applications require container images to be served from container registries such as Docker Hub, gcr.io, or quay.io. In firewalled environments, accessing these registries is often restricted or completely blocked. This can prevent image pulling, hindering the ability to deploy and upgrade applications.

Solution: To address this, enterprises can use registries that have repository replication or pull-through caching capabilities to host container images locally within the firewall. These registries can either replicate or pull images from external registries in a controlled manner, ensuring that the necessary container images are available without constant internet access. Registries like Harbor provide secure, internal image repositories for such environments. Further, utilizing image promotion workflows ensures that only vetted images from external sources make it into the secure registry.

Another approach I’ve used is to copy the images via a gateway or proxy server with connectivity to both source and destination registries. This solution might work where the source and destination registries’ capabilities are unknown. Tools like imgpkg, crane, or skopeo can copy images between registries that cross firewall boundaries. For example, the imgpkg packaging format bundles an application’s helm chart and its container images as a single unit. An imgpkg bundle can be exported as a tar archive from the source registry to the proxy server’s local filesystem. This tar archive can then be pushed to the registry running behind the firewall, and imgpkg ensures that the registry references in the application’s helm chart inside the bundle are automatically updated to point to the destination registry.

Cluster Management and Control Plane Access: Kubernetes’ control plane (API server, etc.) must communicate with the worker nodes and external cloud APIs to manage the cluster. However, in firewalled environments, external access to these APIs or control plane components is often blocked or limited, posing significant challenges for monitoring, scaling, and controlling the cluster.

Solution: Organizations can establish reverse proxying and VPN tunneling techniques to overcome this. A reverse proxy deployed in a demilitarized zone (DMZ) can handle API requests from within the firewall while providing a secure entry point. Additionally, bastion hosts and VPN gateways can allow controlled, secure access to the Kubernetes control plane. These hosts reside outside the internal network but act as a bridge between the restricted environment and external services, allowing administrators to interact with the cluster without violating firewall policies.

For example, Azure allows the creation of “private” AKS clusters that are deployed in an enterprise’s private network. Access to the control plane of private AKS clusters is restricted by default for security reasons. But Azure also provides solutions like Azure Bastion, which provides secure access to a private cluster from the outside world. The user connects to Azure Bastion via RDP or SSH from their local computer and can access the private cluster by proxy. Bastion takes care of securing traffic to the private cluster.

External Dependencies and DNS Resolution: Sometimes, an application running on an air-gapped Kubernetes cluster may need to pull an external dependency for which it may need to resolve a hostname outside the firewall. Access to public DNS like Google DNS or Cloudflare DNS may not be directly available from inside the pod, and the application may not be able to pull the dependency and fail to start. This will force the organization or the application developer to resolve the dependency within the firewall which may only sometimes be feasible.

Solution: Use DNS forwarding in CoreDNS. CoreDNS is the default DNS resolver in Kubernetes clusters and can be configured to resolve external DNS queries from within the firewall. CoreDNS can be modified to forward DNS queries to specific hostnames (like www.example.com) to external resolvers and resolve all other queries within the firewall. This is done by using the “forward” CoreDNS plugin to forward the query for www.example.com to Google or CloudFlare DNS and forward everything else (represented by a ‘.’) to the local resolver by just pointing them to /etc/resolv.conf This ensures that critical DNS resolution is not blocked by firewall policies and also allows the firewall administrator to keep their network secure by allowing only specific external queries.

Updates, Patches, and Kubernetes Components: Regular updates and patches to Kubernetes components are essential for maintaining security, compliance, and performance. However, automated updates may be blocked in firewalled environments, leaving clusters vulnerable to security risks.

Solution: Use local mirrors and internal container registries to update the cluster. Kubernetes installation tools like Kubespray allow cluster management in offline environments. Installing and patching Kubernetes via Kubespray requires access to static files like kubectl and kubeadm, OS packages and a few container images for the core Kubernetes components. Static files can be served by running an nginx/HAproxy server inside the firewall. OS packages can be obtained by deploying a local mirror of a yum or Debian repository. And the container images required by Kubespray can be served by running local instances of a ‘kind’ or docker registry with pre-populated images.

Additionally, companies can use continuous integration/continuous delivery (CI/CD) pipelines to handle updates in a controlled manner, with local testing and validation on staging clusters before rolling out changes to production clusters. GitOps is a subcategory of CI/CD that automatically deploys changes to a target environment triggered by commits to a Git repository. Staging and production clusters can be mapped to different Git branches and upgrades and patches can be rolled out strategically by committing changes to the staging branch first, testing it thoroughly, and only then committing the same change to the production branch. This ensures that the cluster is up to date with the latest security patches despite not having automatic updates.

Third-Party Integrations and Monitoring: Modern Kubernetes applications often rely on third-party integrations like Datadog and external storage solutions like AWS S3 or Google Cloud Storage. In a firewalled environment, outbound traffic is restricted, preventing direct communication with these cloud-hosted services.

Solution: Organizations can deploy self-hosted alternatives within their firewalled environment to maintain observability and monitoring. For example, Prometheus and Grafana can be deployed internally to handle metrics and visualization, while distributed storage solutions like Ceph or MinIO can replace external cloud storage. These tools can replicate the functionality of external services while ensuring that all data remains securely within the firewall. Container images and helm charts for self-hosted alternatives can be pulled into the air-gapped environment using the image management and distribution technique outlined earlier.

Security Policies and Compliance: Security and compliance concerns are often the primary reason for deploying Kubernetes in firewall environments. Industries like healthcare and finance require strict adherence to regulations like HIPAA and PCI-DSS, which mandate the use of secure environments with restricted access to sensitive data.

Solution: Kubernetes’ native features, such as Pod Security Policies (PSPs), Role-Based Access Control (RBAC), and Network Policies, can be leveraged to enhance the security of the Kubernetes cluster within a firewalled environment. Additionally, deploying service meshes like Istio or Linkerd can provide fine-grained traffic control and security, ensuring that only authorized services communicate. These meshes also offer mutual TLS (mTLS) for encrypting traffic between microservices, further enhancing security and compliance.

Ingress control and Load Balancing: In firewalled environments, external load balancing services (like AWS ELB or GCP Load Balancers) may not be available, causing difficulties in routing traffic to services running within the Kubernetes cluster. Kubernetes’ built-in NodePort-type services are not secure as they require a non-standard port to be opened on all the Kubernetes nodes. Each service that needs to be exposed outside the cluster requires a separate NodePort service, thus complicating the firewall administration.

Solution: To expose services outside the cluster, an ingress gateway like Istio or Contour can serve as a proxy that routes traffic to those services. They secure access to the internal services as they can terminate TLS traffic and serve as the single entry point for all services that need to be exposed.

Private load balancing solutions like MetalLB can be deployed to provide high availability of the IP/hostname for the ingress gateway. Using a combination of MetalLB and an ingress gateway improves security. There would be just one IP address/hostname to protect, and all network traffic to all exposed services would be encrypted.

Deploying and managing Kubernetes in firewalled environments introduces unique challenges, from image management and control plane access to DNS resolution and third-party integrations. However, with the right strategies and tools, organizations can harness the power of Kubernetes while maintaining the security, compliance, and operational stability required by their firewalled infrastructure. Techniques such as container registry image replication, DNS forwarding for specific queries, VPN tunnels, ingress gateways, and self-hosted monitoring tools ensure that Kubernetes remains a viable solution even in the most restricted environments.

Organizations aiming to adopt cloud native technologies behind firewalls must design their infrastructure thoughtfully, ensuring that security requirements are met without sacrificing the scalability and flexibility that Kubernetes offers. By leveraging the above solutions, Kubernetes clusters can operate effectively, even in highly restricted environments.

The post Deploy Kubernetes Behind Firewalls Using These Techniques appeared first on The New Stack.

eBPF Is Coming for Windows

Joab Jackson — Fri, 11 Oct 2024 16:00:56 +0000

At the virtual eBPF Summit last month, Thomas Graf, who is CTO and cofounder of Isovalent, talked about the future of the open source filter-turned kernel engine. And that future includes Microsoft Windows, he noted.

Microsoft researchers have embarked on a project to make a version of eBPF for Windows, which is to say give the Windows kernel a similar programmable interface.

Since its inclusion in the kernel a decade ago, the Linux-based eBPF has found widespread adoption, particularly for observability, security and compliance tools that benefit from its programmable in-line speed to analyze and filter packets without the need for cumbersome modules or dangerous kernel modifications.

With the promised cross-platform compatibility between Windows and Linux, tool makers can write binaries that run on both platforms.

eBPF … For Windows

Like the Linux eBPF, Windows eBPF will offer a sandbox to execute small programs within the kernel itself, using an enclaved in-kernel interpreter to execute eBPF bytecode, once the code is verified.

The Microsoft project, captured on GitHub, shows 43 contributors, with the code mostly written in C, with a smattering of C++.

The package will bring bytecode compatibility with Linux eBPF, Graf said, and also feature a similar interpreter and just-in-time compiler for bytecode execution. But the hook points where eBPF connects to the kernel may differ, given the differences with the Windows system calls.

Microsoft’s architecture for its eBPF for Windows kernel (Windows)

All the tooling that has been done for the Linux eBPF will also be ported over to Windows environs “in the coming years,” Graf said.

He warned that this will bring more challenges to the community. Going forward, tool makers will need to ensure that their wares work in both environments.

Hence the need for standardization.

eBPF Standardization

Originally, eBPF (which, the keepers now agree, no longer stands for anything) evolved as a set of code; it did not follow a pre-defined specification that it was implementing, Graf pointed out. As a result, the code itself “is the standard” that the tool makers must write to, he said.

The Internet Engineering Task Force (IETF) has embarked on a project to solidify things a bit more, as to guarantee as much “cross-platform” compatibility between Windows and Linux as possible, explained Dave Thaler, a technical advisor for the working group who is also one of the main contributors to the Microsoft eBPF project, in an earlier presentation this year for the Linux Foundation‘s Storage Summit.

The first task of IETF eBPF Working Group plans to solidify the Instruction Set Architecture (ISA) for the virtual machine that runs the eBPF programs. The body has largely finished the document that describes ISA, minus some last call feedback.

After the ISA work is finished, the group plans to also develop a set of expectations for the verifier, which guarantee the safe execution of untrusted eBPF programs. What should a verified do to ensure code is safe? What security properties does a verifier guarantee? For this work, the group can build from the Linux kernel’s verifier.rst for eBPF.

The group also plans to create a format for producing portable eBPF binaries via an ABI (application binary interface) specification, perhaps based on one of those already existing.

The post eBPF Is Coming for Windows appeared first on The New Stack.

Mojo’s Chris Lattner on Making Programming Languages Evolve

David Cassel — Fri, 11 Oct 2024 15:00:59 +0000

Modular’s new Python-based Mojo programming language — launched in 2023 — is designed for deployments on GPUs and other accelerators, according to its FAQ. “Our objective isn’t just to create ‘a faster Python,’ but to enable a whole new layer of systems programming that includes direct access to accelerated hardware.”

Or, as language designer Chris Lattner said in a new interview, they wanted to focus on “how we express” the hardware’s full capabilities. But they also wanted to “meet people where they are” — maintaining the look and feel of Python — since a lot of people, especially in the AI community, were using Python.”

But as a new language evolves, it’s a unique chance to explore not just the logic behind its language-design decisions, but also the underlying philosophy that will ultimately unite them all. What do programmers really care about in 2024? For the 100th episode of the podcast “Software Unscripted,” Lattner shared his insights from a career spent designing programming languages and compilers, along with lots of the infrastructure that goes around it.

Podcast host Richard Feldman — creator of the Roc programming language — began with the biggest question of all: Why? In a world with so many choices, why make a brand-new programming language at all? And Lattner had a quick answer: “It’s basically just a way to solve a problem.”

But it turns out that the gritty details of the answer — not just the “why,” but the “what” and the “how” — give a panoramic perspective on the whole programming language ecosystem of today.

Evolving Replacement Parts

Lattner agreed with Feldman’s description that Mojo is accessing Python’s ecosystem while “evolving replacement parts” to improve performance. And that work is ongoing. “One of our goals that we’re building into for this fall is make it super easy to make a Python package from Mojo,” Lattner said, partly to bring to Python programmers the advantages of Mojo’s better performance.

Mojo will be “subtracting all the complexity of interoperating with C,” Lattner said, while providing “the same performance, or better, that you get for C or C++.”

But then Lattner calls it “a natural thing” that happens as successful language communities start to scale up. “Programmers want to bring their skill sets forward — and as they do this, they bring it into adjacent domains that they want to apply the technology to.”

Of course, there’s a simpler answer, Lattner said later. “We’re building Mojo because we care about a lot of AI and GPU and accelerator stuff.”

“There’s so much stuff in the Python community that’s been about, like, ‘Let’s make Python unmodified go fast, and you can get 20x speed-ups,’ and stuff like this. But Mojo is saying, ‘Let’s not work forward from Python and try to make Python a little bit better.’ We’re saying, ‘Let’s work backwards from the speed of light of hardware!’ Unlock the full power of hardware — which isn’t just, you know, int being fast… but it’s also like accelerators and SIMD [single instruction/multiple data support for parallel processing] and like all these kinds of things…

“It depends on your workload, but we have a blog post showing that Mojo can be 65,000 times faster than Python.”

“And the usual pushback on that is… ‘Well, you would never write numeric code that’s doing math and dense arithmetic in Python.’ But that’s literally the point of Mojo — is to make it so all that code you would never write in Python. You can write in a language that’s coherent and consistent with Python. So you don’t have to switch languages!”

SIMD Support — and Second Syntaxes

To offer first-class support of parallel processing, Mojo’s SIMD-supporting syntax includes elements for all of the different number types.

Here Lattner paused for a pointed aside about programming languages today. “And oh, by the way, all processors have had SIMD since like the late ’90s — and so why is it that no languages have embraced modern computers? It’s unclear to me.” (Later he finds himself marveling again that “so much code is still single-threaded.”)

Mojo also found another way to improve its performance over Python. “Python’s integers are big integers,” Lattner said, “like you can make an arbitrary-sized integer, and it’s a heap-allocated thing, and it’s reference semantics — all this stuff.” Mojo kept the name — (lowercase) int — in its syntax but while also including its own alternate type in the standard library — Int (with an uppercase I). And this Int is a struct, greatly easing the work of Mojo’s compiler.

So while Mojo still supports Python’s original syntax, if you switch to Mojo’s version of int “you get way better performance. You get better predictability. It will run on a GPU, etc., etc. And it’s not about one being better than the other. It’s a tradeoff.”

Getting Complex — and Unlocking Superpowers

There’s another new feature in our modern world. As Lattner describes it, “Some of these chip people” decided it’s important to fully support math with complex numbers (which include so-called “imaginary” numbers, which are widely used in technical applications including engineering and physics formulas). “And so once somebody puts something into hardware and said, ‘Wow, this is 10x faster than doing individual multiplies and adds,’ you kind of have to say, ‘Okay, well, 10x improvement — because they put into silicon… How do we expose that and then make it so people can use this stuff without having to know about it?”

So Mojo allows the defining of structs using complex numbers, with simple behavior and “a bunch of methods — including multiply.” And then good things can happen if the compiler detects the presence of complex number-accelerating hardware with a special speedy instruction for multiplication. “Boom! Everybody that uses complex numbers can be accelerated. They don’t have to worry about it.”

With Mojo, Lattner did keep Python’s operator overloading — the ability to write your own customizations for a symbol’s default behavior — even using Python’s syntax to preserve compatibility. But in the end, a lot of that extra complexity has been outsourced to library developers.

At one point Lattner said he knows a lot of compiler engineers but wants to take the “ecosystem of talent” that exists higher up the stack where the library developers live — those programmers who deeply understand their domain, along with things like performance-enhancing GPU tricks. “My quest with Mojo is really about unlocking those people and giving them superpowers.”

Later he said this is in contrast to “a trend that I’ve seen over the last five or 10 years in this space… ‘Put sophistication into the compiler, lock it up, throw away the key and trust us — us compiler people have got it’… Because what I’ve seen is that that’s not actually really that true. … A compiler is rarely going to give you a 10x improvement. But somebody working at the application domain, because they know the application — totally can, because they can use the right tool for the job.”

The end result? In a domain where even numbers come in over a dozen different types, “we can just have people define these things in libraries… That’s actually great — and it reduces pressure on the language from having to churn with all the hardware!”

Mojo’s Community

The interview ended with Lattner calling out to interested programmers, suggesting they check out Mojo’s web page and its “ton of documentation. The community is amazing! We have a Discord channel — I think we have 20,000-ish people on Discord, all talking about different stuff, building things.”

He also added a word of encouragement. “Mojo is not just an AI language. It’s also being used to build web servers and GUI libraries and all kinds of stuff by the community. And so we love for people to get involved. Mojo is still pretty early, and so we’re still adding core capabilities and building out the libraries.

“But we have a huge community of people that are really enthusiastic, and I’d love for people to get involved.”

The post Mojo’s Chris Lattner on Making Programming Languages Evolve appeared first on The New Stack.

Runtime Context: Missing Piece in Kubernetes Security

Oshrat Nir — Fri, 11 Oct 2024 14:30:46 +0000

More and more organizations rely on Kubernetes to deploy and manage their applications. However, traditional security approaches often fall short of addressing the unique challenges posed by these dynamic, containerized environments. Integrating runtime context into Kubernetes security creates a feedback loop between posture management and runtime security, significantly boosting an organization’s overall security.

Limitations of Static Security Measures

Conventional security strategies typically rely on static analysis and predefined rules. While these methods are valuable, they struggle to keep pace with the dynamic nature of Kubernetes environments. Containers are ephemeral, workloads are constantly shifting, and the attack surface is ever-changing. Static security measures alone cannot provide the real-time insights necessary to detect and respond to emerging threats effectively.

Runtime context is the missing piece in the Kubernetes security puzzle. By continuously monitoring and analyzing the behavior of applications and workloads during execution, security teams can gain invaluable insights into potential vulnerabilities and anomalies. This real-time information allows for more accurate threat detection, reduced false positives and faster incident response.

Synergy of Posture Management and Runtime Security

To harness the power of runtime context, organizations need to establish a feedback loop between posture management and runtime security. This approach requires a unified platform capable of handling both aspects seamlessly.

Here’s how this synergy works:

Posture management: This involves assessing and enforcing security configurations, policies and best practices across the Kubernetes environment. It establishes a baseline for security and compliance.
Runtime security: This component continuously monitors the environment. In addition to detecting anomalies, potential threats and policy violations in real time, it assesses the needs of the workloads running on the infrastructure. This ensures that the information provided to posture management and static security is based on real-world behavior rather than relying solely on industry best practices.
Feedback loop: Insights gained from runtime security feed back into posture management, enabling continuous refinement of policies and configurations based on actual behavior and emerging threats.

eBPF in Enhancing Runtime Security

Extended Berkeley Packet Filter (eBPF) technology allows efficient, low-overhead monitoring and tracing of system calls, network activity and other critical operations without modifying the kernel or applications. Here are some key use cases where eBPF lends itself to Kubernetes security:

Automated secure computing mode (seccomp) profile generation: eBPF can be used to automatically generate and enforce seccomp profiles based on observed runtime behavior. This approach takes the guesswork out of creating seccomp profiles, reducing the risk of overly permissive or overly restrictive policies.
Real-time system call monitoring: eBPF enables real-time monitoring of system calls, providing immediate insights into potential security violations or anomalous behavior.
Automated network policy generation: eBPF can trace network activity at the kernel level, offering deep visibility into container communications and potential network-based threats. This data can then be leveraged to automate the creation of network policies.
Reachable vulnerabilities: eBPF can help judge whether a vulnerability is reachable and in use. This capability can be used to prioritize security patching and ensure that time and resources are directed to the highest impact security work.

Benefits of a Unified Platform

Implementing this comprehensive approach to Kubernetes security requires a unified platform capable of integrating posture management, runtime security and eBPF-based monitoring. Such a platform offers several key advantages:

Holistic visibility: A unified platform provides a single pane of glass for viewing and managing all aspects of Kubernetes security, from configuration to runtime behavior.
Contextual alerts: By combining insights from posture management and runtime security, alerts become more contextual and actionable, reducing alert fatigue and enabling faster response times and more focused responses.
Automated policy refinement: The feedback loop between runtime observations and posture management allows continuous, automated refinement of security policies based on actual behavior.
Reduced complexity: A single platform simplifies the security stack, reducing the operational overhead of managing multiple disparate tools.

Conclusion

As Kubernetes environments continue growing in complexity and scale, traditional security approaches are no longer sufficient. Organizations can improve their Kubernetes security posture by using a unified platform that integrates runtime context, posture management, runtime security and advanced technologies like eBPF. This comprehensive approach provides the real-time insights, adaptability and automation necessary to protect against evolving threats in today’s dynamic cloud-native landscapes.

The future of Kubernetes security lies in platforms that can seamlessly integrate these components, offering a holistic, context-aware approach to protecting containerized applications and infrastructure. As the threat landscape continues to evolve, organizations that embrace this unified, runtime-centric security model will be best positioned to defend against sophisticated attacks and ensure the integrity of their Kubernetes environments.

For insights into eBPF’s transformative potential in cloud-native security, attend the eBPF security use cases panel at Cilium and eBPF Day on Nov. 12, part of KubeCon + CloudNativeCon North America 2024.

To continue the discussion, visit ARMO at booth Q26 at KubeCon in Salt Lake City, Nov. 12-15.

To learn more about Kubernetes and the cloud native ecosystem, join us at KubeCon + CloudNativeCon North America, in Salt Lake City, Utah, on Nov. 12-15, 2024.

The post Runtime Context: Missing Piece in Kubernetes Security appeared first on The New Stack.

How to Install Python 3.13? Use the Interactive Interpreter

Jack Wallen — Fri, 11 Oct 2024 13:36:34 +0000

With the latest release of Python (version 3.13), there are several exciting features, including the new interactive interpreter. This interpreter features multiline editing with history preservation; support for read–eval–print loop (REPL)-specific commands (such as help, exit and quit) without having to call them as functions; prompts and tracebacks (with color enabled); interactive help browsing with a separate command history; history browsing; and paste mode.

Combined, these features make for a considerable leap forward in an interpreter that hasn’t seen a lot of new features appear in the recent past. For anyone who uses the Python interactive interpreter, this should be an early Christmas.

This interactive interpreter is based on code from the PyPy project and can be disabled by setting the PYTHONG_BASIC_REPL environment variable. The new interactive shell is available to UNIX-like systems (such as Linux) with curses support and Windows. By default, the interpreter uses color for things like prompts and tracebacks. It’s possible to disable the color option by setting the TERM variable to dumb.

Let’s look at how the new interpreter works.

Easier Exit

If you’ve used the Python interpreter, then you know exiting it requires the Ctrl+D keyboard shortcut.

Or at least it used to.

Now, the interpreter exit makes sense because all you have to do is type “exit.” As someone who’s been using the Linux terminal for decades, this is a welcome change. It never fails that when I’m finished using the interpreter, I type exit, only to be presented with an error.

Until Python 3.13, it was full-on Jean-Paul Sartre and no exit.

In the same vein, you can also now clear the interpreter screen with the clear command, which is very helpful when you need to start over and want a clean space to use.

Improved Error Messages

Confession time: When I first started learning Python, I had no idea that you had to be careful with file names. For example, I’d be creating an app that uses the random library module and name the file random.py. I’d then try to run the code, only to receive a rather cryptic message that gives me no indication about what was wrong.

Little did I know that the problem was the file name. Eventually I figured that out, changed the file name and re-ran the app without problems. Clearly, the error was not in the code itself.

With the new interpreter, those error messages are far less cryptic. For example, you might see something like this in the error message:

(consider renaming '/home/jack/PYTHON/random.py' since it has the same name as the standard library module named 'random' and the import system gives it precedence)

That certainly would have been nice back when I was taking my first steps with Python. I’d have saved a lot of time troubleshooting silly issues such as a file name conflict.

Speaking of error messages…

Color, Color Everywhere

Okay, the new Python interpreter doesn’t spill color over everything. What you’ll find is that color is enabled (by default) for prompts and tracebacks. What does this mean? It means you’ll be able to spot problems a lot easier from within output of the interpreter.

Let’s take our improved error messages feature for a ride. We’ll stick with our numpy.py example. If I attempt to run that app, I know I’ll get error messages because of the file name. However, with Python 3.13, those errors are colored for easier reading.

Figure 1

Error messages are not only smarter; they’re easier to read in Python 3.13.

Executable Scripts

Another cool feature is the ability to make a Python script executable on Linux, without having to run it with python3. To do this, you must add the following line to the top of your code:

#!/usr/bin/env python3

Save and close the file. Next, give the file executable permission with:

chmod u+x name.py

Where name is the name of your script.

Now, to run your Python script, all you have to do is issue the command:

./name.py

Where name is the name of your script.

Getting Python 3.13 on Ubuntu

If you attempt to install Python 3.13 from the standard repositories, you won’t have much luck. However, there is a repository you can use (if you can’t wait for your distribution of choice to add the latest version to the standard repos). Let me show you how to take care of this.

First, open a terminal window and install the solo dependency with:

sudo apt-get install software-properties-common -y

Once that’s taken care of, add the required repository with:

sudo add-apt-repository ppa:deadsnakes/ppa

When prompted, hit “Enter” on your keyboard.

After the repository has been added, you can then install Python 3.13 with the command:

sudo apt-get install python3.13 -y

You’re not out of the woods yet. At the moment, your system is probably still defaulting to Python 3.10, so you have to configure it to use 3.13. To do that, we’ll add both 3.10 and 3.13 as alternatives. First add 3.10 with:

sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1

Next, add 3.13 with:

sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.13 2

Finally, configure the default with:

sudo update-alternatives --config python

When prompted, select 2 and Python 3.13 is set. If you issue the command python -v, you should see that 3.13 is now the default.

To find out more about what’s been added to Python 3.13, make sure to check out the official release announcement.

The post How to Install Python 3.13? Use the Interactive Interpreter appeared first on The New Stack.

AWS Adds Support, Drops Prices, for Redis-Forked Valkey

Joab Jackson — Thu, 10 Oct 2024 20:41:16 +0000

Cloud giant Amazon Web Services has expanded support for Valkey, the Linux Foundation-backed open source fork of the previously-open source Redis key value data store.

In March, Redis restricted the license of its namesake in-memory data store, moving from an open source license to a Business Source License (BSL). A number of code contributors to the project, including those from AWS, quickly forked the data store into Valkey.

Valkey for ElastiCache and MemoryDB

On Thursday, AWS added support for Valkey 7.2 on two of its managed in-memory services, Amazon ElastiCache and Amazon MemoryDB. Both services already support Redis.

ElastiCache is a managed in-memory service, an easier-to-manage front end for either Redis and memcache, for those who need a real-time throughput for their apps.

Amazon likes to boast that during Prime Day 2024, ElastiCache served more than a quadrillion requests, with a peak of over 1 trillion requests per minute.

MemoryDB also offers microsecond-response caching capabilities and can store data via a permanent log.

Earlier this year, AWS released the Valkey GLIDE, a connector for bridging Redis and Valkey. It comes in Java, Python, and Node.js flavors.

Price Cuts for Valkey

AWS also cut costs of the Valkey packages so that they would be lower than their Redis equivalents.

ElastiCache Serverless for Valkey is priced 33% lower than ElastiCache Serverless for Redis, and 20% lower in the node-based ElastiCache.

The portions can be smaller too: Users can allot a cache with a minimum of 100MB cache, compared to the 1GB minimum for the Redis equivalent.

Likewise, MemoryDB for Valkey is priced 30% lower than MemoryDB on Redis.

Corey Quinn, chief cloud economist of the AWS-focused Duckbill Group, found strategic significance in the price differences between the Redis and Valkey offerings, noting it gave Redis customers an incentive to move to Valkey.

“It’s almost as if AWS discovered that Redis’ service margin was just taking up space in their massive bank vaults,” the analyst quipped.

AWS and Valkey

AWS contributed to the Redis software when it was open source, bequeathing fine grained access control over keys and commands, native hostname support for clustered configuration and partitioned channels for scalable pub/sub for version 7.

After Valkey went its own way, AWS contributed the first major version of Valkey, Valkey 8.0, helping out with a new I/O threading architecture and some memory optimization that reduces up to 20.6% of the memory overhead.

More Activity Around the Data Store

Last week, Google announced that it would be bringing vector processing to managed services for both Redis — called Memorystore for Redis Cluster — and for Valkey, Memorystore for Valkey 7.2.

In the meantime, Redis has released Redis 8, the now BSL-licensed version of the data store system. The new release supports seven new data structures (“JSON, time series, and five probabilistic types,” the news release states), has an updated query engine, and can now do vector search and geospatial queries.

The post AWS Adds Support, Drops Prices, for Redis-Forked Valkey appeared first on The New Stack.

Atlassian’s New AI Product Gives Developers Access to Agents

Loraine Lawson — Thu, 10 Oct 2024 20:00:05 +0000

Atlassian, the creators of Jira, released a generative AI product called Rovo that features pre-configured AI agents and the ability to create your own AI agents using natural language.

Rovo also has extensions for IDEs and GitHub for Copilot, allowing developers to access standards, edge cases, code or other information stored in enterprise systems like Slack, the Git repository solution BitBucket, and Atlassian’s knowledge management tool, Confluence, without leaving the developer workspace.

The AI solution has more than 20 pre-configured AI agents, and will introduce two of particular interest to developers: Auto Dev and Auto Review. Currently, the agents are in the lighthouse phase but Atlassian developers have documented a 75% time saving for simpler tasks, according to Atlassian’s Matt Schvimmer, a senior vice president who oversees product for the Agile and DevOps division of Atlassian.

It also incorporates enterprise-wide search and a chatbot similar to ChatGPT, both of which leverage semantic indexing and natural language. Semantic indexing helps provide context around the information so that users don’t have to find “magical keywords” to pull up the data.

Rovo is generally available now after a six-month private beta. However, the developer AI agents will launch within a year, pending the results from the lighthouse trial.

AI Agents for Developers

The Auto Dev AI agent can take an issue, generate a technical plan, and then generate code to resolve the issue. Developers can tweak the results at any step of the process.

“They can either make changes manually or ask the agent to regenerate things, and it gets you all the way to a PR,” Schvimmer said.

Auto Review reviews code and cleans it up for the developer prior to a code review. It currently supports BitBucket and GitHub and will eventually support GitLab, Schvimmer said.“This is a, basically, a virtual senior reviewer that can go in and give you suggestions on what needs to change before something can get approved,” he said. “More commonly, you’re going to use it to help speed up the time it takes you to review a PR.”

Developers and users can also create their own bots with or without code. Atlassian employees have created more than 500 custom agents internally.

Schvimmer, who began as a Cobalt programmer, created a “fluff eliminator” AI agent using natural language. It detects whether content from Confluence takes more than 10 minutes to read, and if it does, the AI agent will distill the page down into a summary and email the author to let them know the page is too verbose.

“It’s easy to do, and the other nice thing is that you can add [a workflow engine] underneath JIRA, so whenever some action is taken, an agent is triggered,” he said.

Another useful agent for development teams is the Issue Organizer, Schvimmer said.

An Atlassian AI agent works within BitBucket to review code. Screenshot via Atlassian’s blog.

“If you’re in a development team, oftentimes you’ll have a ton of different issues that are written,” Schvimmer said. “A bunch of these requirements are written that really should be aggregated to one epic — one big block of work. So the Issue Organizer finds all those pieces, suggests epics that should be managed together and allows you to group them into an epic and then schedule them into sprints, simplifying the time, putting light work together.”

Rovo for Everybody

Rovo is a standalone product that’s available to all users within an enterprise, not just Jira or Confluence users, at no additional cost to user organizations. Based on the beta, Atlassian estimates it’s saving 1-2 hours per week for users, according to a video introduction by Jamil Valliani, head of product for Atlassian Intelligence. He believes there’s more potential for savings.

It’s also designed to work with other popular software. For instance, a browser extension allows Rovo to function within Google Drive and Figma files on Google Drive accounts. It also integrates with SharePoint.

“Internally, we’ve been able to document a 75% time savings in terms of some of the simpler tasks.”
— Matt Schvimmer, senior VP, Agile and DevOps, Atlassian.

Overall, Atlassian plans to integrate the AI via connectors with nearly 80 popular SaaS apps over the next six months, allowing the AI to pull from those tools as well. It will have data center connectors starting next year with Confluence, Valliani added.

More than 80% of their data customers said it helped them find results in a timely manner, according to Schvimmer.

In addition to answering questions, Rovo can also provide context for the information. For instance, documents and other information used by people you collaborate with are prioritized higher in the search results. It also creates people and team knowledge cards so that developers and other users can learn about the teams and people on them, including what projects they’re working on and other contextual information.

Rovo Chat is enabled throughout the Atlassian suite. The chat understands the context you’re in, so if you’re on a particular Confluence page, it will focus on answering questions related to that context, Schvimmer said. It can also take actions such as creating a Jira issue or a Confluence page on command, he added.

Editor’s Note: Updated to correct Jamil Valliani’s first name.

The post Atlassian’s New AI Product Gives Developers Access to Agents appeared first on The New Stack.

OSI Finalizes a ‘Humble’ First Definition of Open Source AI

Heather Joslyn — Thu, 10 Oct 2024 19:00:41 +0000

After nearly three years of planning, including community meetings and a months-long global “roadshow” to gather feedback, the Open Source Initiative (OSI) has published Release Candidate 1 of its long-awaited definition for open source AI.

The document, published Oct. 2, includes definitions for four different kinds of data: open, public, obtainable and unshareable.

It also demands transparency from creators and sponsors of AI technology that bears the open source label, requiring that those creators share the data (if shareable), along with the source code used to train and run the system, and the model’s parameters.

What Release Candidate 1 doesn’t include: any attempt to address safety or risk limitations. Those concerns should be handled by governments, OSI Executive Director Stefano Maffulli told The New Stack.

“Governments around the world have different frameworks to understand what is acceptable risk, or what is ethical, sustainable, valuable,” he said. “All of these words, they come with trade-offs. It’s not our job to decide which ones they are.”

OSI’s goal in crafting the definition, he suggested, is to leave room in the definition for governments to act as they see fit. “We wanted to make sure that the definition was not going to be an impediment for that,” Maffulli said. “Otherwise, it will be failing on delivery, right?

‘No New Features, Only Bug Fixes’

OSI is continuing to gather feedback (on hackmd and on the OSI forum) about Release Candidate 1 and endorsements ahead of its planned launch at the All Things Open conference on Oct. 28 in Raleigh, N.C. There will likely be enough minor tweaks to justify a Release Candidate 2 ahead of the rollout, Maffulli said. But the intention is to start wrapping it up for now.

“With the release candidate cycle starting today, the drafting process will shift focus: no new features, only bug fixes,” reads a note from OSI on its website. “We’ll watch for new issues raised, watching for major flaws that may require significant rewrites to the text. The main focus will be on the accompanying documentation, the Checklist and the FAQ.”

However, Maffulli said, the definition will be a work in progress: “This is 1.0, but it’s a very humble 1.0. We’re not saying that this is done deal, we’re never going to look at it again and don’t bug us — like, drop the mic and go home.

“What’s going to happen is that we expect that 1.0 is going to be ready for use, which means that corporations, research institutions, academics, etc., deployers, users, can use it as a reference to start interpreting what they find on Hugging Face or something. They see a model, and they have now a reference.”

Maffulli added, “We’ve basically built something that is more of a manifesto than an actual working, 10-point checklist definition to evaluate legal documents. We’re very early, in very early stages, and that’s why it’s a humble 1.0 release.”

What’s in Release Candidate 1?

“Open Source means giving anyone the ability to meaningfully fork (study and modify) your system, without requiring additional permissions, to make it more useful for themselves and also for everyone,” reads the FAQ accompanying Release Candidate 1.

In line with that principle, the FAQ states open source AI is “an AI system made available under terms and in a way that grant the freedoms to:

Use the system for any purpose and without having to ask for permission.
Study how the system works and inspect its components.
Modify the system for any purpose, including to change its output.
Share the system for others to use, with or without modifications, for any purpose.

So, what’s in this near-final 1.0 version of the open source AI definition? Here are some key components:

Demands for Transparency

As previously stated, OSI’s Release Candidate 1 requires open source AI project creators to share the data information used to train the system, the complete code used to train and run the system, and the model parameters, “such as weights and other configuration settings.”

Will the level of transparency required for open source AI, under this definition, cause some creators of AI projects to keep them proprietary?

“That’s exactly what I think will happen,” Maffulli said. But, he added, this is also what’s happened to open source software more generally. “There are companies like Microsoft and Oracle, they don’t release the source code of their — call them ‘crown jewels,’ like Windows and Microsoft Office and the Oracle database.

“That source code is not available. It’s not transparent. And that doesn’t mean that open source is lost or anything like that. Just that it’s another part of the ecosystem, that you know exists.”

4 Different Categories for Data

The document’s FAQ section breaks data into four categories, noting that all four might be used to train a language model:

Open: “Data that can be copied, preserved, modified and reshared,” reads the FAQ.
Public: “Data that others can inspect as long as it remains available,” the FAQ described, noting that “this data can degrade as links or references are lost or removed from network availability.”
Obtainable: “Data that can be obtained, including for a fee.”
Unshareable: “Data that cannot be shared for explainable reasons, like Personally Identifiable Information.”

For data that falls into the “unshareable” category, the goal of enabling a “meaningful fork” of the technology is the guide:

“[T]he ability to study some of the system’s biases demands a detailed description of the data — what it is, how it was collected, its characteristics, and so on — so that users can understand the biases and categorization underlying the system. This must be revealed in detail so that, for example, a hospital can create a dataset with identical structure using their own patient data.”

The four categories reflect some messy reality that OSI encountered during its long period of research and community feedback.

When the process began, Maffulli said, the impulse was to insist that all three elements of an open source AI — data, code and parameters — be open source. “

But, he added, “Then you start looking a little bit deeper, and we found two main issues. One is on the parameters themselves, parameters weights. What are those things? From the law perspective, it’s not clear whether they have copyright or other exclusive rights on top. So, OK, big, big question mark goes on that box.”

And then, he said, there’s data: “Immediately there is an issue, OK, so maybe there is private data, there is copyrighted data, there is medical data, there is data that you can’t distribute — you can read it and make a copy of, but you cannot redistribute.”

It presented a conundrum. “To simplify the conversation,” Maffulli said, “we identified those four blocks.”

The FAQ acknowledges that data, and transparency around data, has been a perennial sticking point throughout the discussion that led to Release Candidate 1.

“Some people believe that full unfettered access to all training data (with no distinction of its kind) is paramount, arguing that anything less would compromise full reproducibility of AI systems, transparency and security,” the FAQ reads. “This approach would relegate Open Source AI to a niche of AI trainable only on open data … That niche would be tiny, even relative to the niche occupied by Open Source in the traditional software ecosystem.”

As data “gets more and more fine-grained and complicated,” Maffulli told The New Stack, “the definition itself, in its final form, provides for an escape route,” that accommodates differences in data and allows for more open source AI projects to emerge.

Large companies and organizations like OpenAI, “technically don’t have any obstacle to do whatever they want to do. They have no obstacle, neither technical nor legal, to use any of those four kinds of data for dev training.” But organizations with fewer resources to enter into commercial partnerships with data providers, he said, are at a disadvantage.

He added, “Either the definition open source would have to limit the availability of open source AI by excluding some of that kind of data, or we needed to provide a way for the public, and the open source communities in general, to have access to large language models, just like the large corporations can do it. And that’s what we’re doing.”

Clarification: This article has been changed from a previous version, to provide more context for Maffulli’s comment that the OSI open source definition offers “an escape route” in the way it categorizes types of data.

The post OSI Finalizes a ‘Humble’ First Definition of Open Source AI appeared first on The New Stack.

OpenTelemetry Challenges: Handling Long-Running Spans

Hazel Weakly — Thu, 10 Oct 2024 18:00:38 +0000

OpenTelemetry (OTel) has taken the observability landscape by storm, and for good reason! At some point in the last decade, the software world quietly started viewing protocols as standards, evolving them in the open and embracing community-driven open source. Riding on this momentum, OTel quickly grew into the second-highest velocity project in the CNCF ecosystem. With a focus on vendor neutrality and language interoperability, allowing engineers to focus on understanding their systems instead of debugging their debuggers, OTel’s success feels almost obvious in hindsight.

That said, for all the energy around OpenTelemetry, it’s not always a frictionless experience. There are some things that can be really challenging to address in OpenTelemetry’s mental models and assumptions. One of those huge hurdles to address in the real world is long-running spans.

Long … Running? What?

Long-running spans! Well, OK, I’ll back up a little bit and explain a few things. The OTel landscape can be overwhelming at first since it has so many concepts to know before you get started. When people talk about OpenTelemetry, they’re usually talking about distributed tracing. While I’ll focus only on the relevant bits, here’s a thorough overview if you’re interested.

When you debug a system, your first question is typically something like, “What action happened?” However, one action from the end user’s perspective translates to several from the system’s perspective. To reconcile that, OTel has the concept of a span, which is an action from the end user’s perspective. Inside that span, you have more spans that represent all the actions from the system’s perspective. We call the “span that represents the user viewpoint” a trace … usually.

However, the relevant part of a span for us is the fact that it contains a few things: an ID, a trace ID, a start time and an end time. When you ship off your little bundle of joy into your observability backend, it comes with all four pieces of information (and the actual data, too). But that means that spans have a duration, which has some profound implications.

It also turns out that, in practice, a lot of tooling really doesn’t want the length of a span to be longer than … well, not very long.

Why does the tooling care about this? There are a few big reasons! A huge one is that the OTel API specification states something very important: All spans must be ended, and this is the responsibility of the implementer. Further, it states that if a developer forgets to end the span, the API implementation may do pretty much anything with it. The software development kit (SDK) specification provides built-in span processors, but those operate only on finalized spans.

In other words, the user perspective is a span, and any incomplete spans will probably be lost forever. If the incomplete span happens to be the root span, all of the inner spans that were sent will appear orphaned, if the backend can even handle them existing at all. Practically speaking, it means that root spans that are longer than about five seconds are likely going to cause issues.

Another reason why tooling cares about this is sampling. When you send buckets of data from one place to another, it’s reasonable to ask how you can represent that data better, and maybe avoid sending some of it. That’s where sampling comes in. The sampling service takes the telemetry and decides whether or not to send it to the backend or not (plus some fancy math adjustments that make it all work out). Neato! Except, there’s a small problem: How does it decide when something is relevant to send or not? Sampling decisions have to work on a complete span, and often operate on an entire trace’s worth of spans. That doesn’t work if you lose the root span!

So, awkwardly, not only are incomplete spans probably lost forever, and not only are the most likely spans to be lost often the most valuable ones, but all of your cost, network and compute optimizations break. Ouch.

Have You Tried Not Having Long Spans?

A great solution to a problem is to fix it, but an amazing solution to a problem is to not have it! Can we … just not have long spans? It’s a noble thought, but it turns out that we’ll encounter this problem regardless of how long our spans are. We’ve been talking about long spans, but this is actually more about interrupted and incomplete spans.

The reason for that is that spans are basically the same as a database transaction in terms of their data model. So, whenever you run into a situation where you need to send transactions over the wire between multiple systems, you’ve encountered a scenario that experts like to call “being in for a real bad time.”

You could try a lot of solutions! Here’s a few that people have used:

Refactor your code to represent actions in smaller chunks.
Break a long action into intervals.
Make fewer traces and carry more data in the child spans.
Manually end the root span early.
Do unsound and sketchy transformations on your data to rewrite span IDs, trace IDs and links. (Repeat after me: “I solemnly swear to probably avoid this in production.”)
Unset the trace ID and utilize links instead.
Ensure your trace context is correct and you’re not inventing a root span on accident.

Unfortunately, none of those address the fundamental issue: When we said spans must be ended and we gave a duration, we made them transactions — and handling transactions across systems is hard.

To make matters worse, while you might think that interrupted spans don’t happen very often, it turns out that they happen quite frequently:

In the backend: Whenever an application restarts mid request, or crashes, or the network fails, or …
On the frontend: Whenever a web client navigates around, closes or refreshes a tab, cancels an action, or the browser event loop gets interrupted, or …
In mobile: All of the above and much more!

However, fortune favors the creative. Now that we know we’re really dealing with a transaction semantics problem (that just happens to look like a “don’t have long-running spans” problem), we can look at all the existing literature on this. Surely someone’s solved this — or, uhh, at least tried?

Creative Solutions to Chonky Spans

Putting our thinking caps and research glasses on, there’s a wealth of information surrounding databases, event streams and distributed transactions in general. However, there is a bit of a problem: Not much of that looks like OTel, and it’s hard to see how the solutions apply. But what if we stretched the definition of a span a little, and, given the constraints … cheated a tiny bit? Would that let us repurpose some solutions from other technology with similar constraints, maybe?

There are two frequently recurring themes in handling transactions: snapshots and write-ahead logs. In fact, logs as a data abstraction are one of the fundamental building blocks of distributed systems. Logs as an append-only ordered data structure end up being the perfect thing to build snapshots on top of, and it turns out that the span processors in the OpenTelemetry SDK can be thought of as an in-memory write-ahead log. OK, you’ve got to squint a bit to think of it like that, but really, it is.

Awesome! Not only do we have an industry adopted pattern for handling transactions in the form of logs, but we already have most of the pieces required to build snapshots! Snapshots won’t solve all of our problems, but it’s a massive improvement, and it makes partial data usable — which is invaluable for debugging.

So, uh, how do we do that?

First, we’ve got to reframe the process: Instead of sending spans to our backend, we’re writing spans to a log and then replicating that to the backend consistently.

So, uh, how do we do that?

Good question! It turns out that Embrace has implemented this solution and explained why they did so. As for the how, while log replication has a huge range of possible solutions, a simple one requires only a few small changes to both the client and the server.

First, the client has to send the snapshots of in-progress spans (this requires a custom span processor and exporter).
Second, the backend needs to process and store these and wait for them to be finalized.
Third, if those spans are never finalized, they still have to be massaged into an OTel-compliant shape gracefully and sent upstream. (OK, I lied. It’s not simple. We’re omitting a lot of details here.)

This seems like a lot of work, but Embrace’s SDK and backend does all of this for you, including handling the cases where interruptions occur and spans aren’t finalized. Even better, the spans are fully OTel-compliant when they’re done, which means there’s nothing stopping this solution from making its way into OpenTelemetry.

Tracer.getInstance().endSpan()

Whew! We covered a lot of ground here. First, we talked about what long-running spans are, why we run into them, why they’re a problem and how you can’t avoid them no matter how hard you try. In fact, not only are you going to run into them, but any type of situation that involves incomplete or interrupted spans is subject to many of the same failure modes, which we identified as a transaction semantics problem.

Luckily, it turns out that transaction semantics are a well-studied problem, and we were able to go over a great solution and introduce a sketch of how that might work with OpenTelemetry.

If you’re coming from a traditional backend-focused approach to observability, you’d be surprised just how fundamentally different the mobile environment is. Embrace has a helpful on-demand webinar if you’d like to learn more: What DevOps and SREs need to know about mobile observability. There’s also a helpful guide: Overcoming key challenges in mobile observability: A guide for modern DevOps and SRE teams.

Long-running spans are hard, transactions are hard, but embracing creative problem-solving to find useful answers is what observability is all about.

The post OpenTelemetry Challenges: Handling Long-Running Spans appeared first on The New Stack.

Start Securing Decentralized Clouds With Confidential VMs

Jonathan Schemoul — Thu, 10 Oct 2024 17:00:11 +0000

Despite constant improvements in encryption and cybersecurity, data breaches are becoming more frequent year after year. But it’s not just that black-hat and white-hat hackers are playing an endless game of tug-and-war. There are more basic, fundamental flaws with handling your data.

Your sensitive information, like your bank statements, medical records, and business IP, is typically encrypted when stored and in transit. However, the most vulnerable stage is when it’s being processed. Cloud data involves 82% of breaches, and 74% have a human element. This points to significant risks when data is actively used and processed, especially in cloud environments where people interact. Traditionally, data is unencrypted during computation, increasing risks.

Confidential computing allows data to be encrypted during this stage. Still, decentralized confidential computing takes it a step further, ensuring that no one (not even your cloud provider) can access your data while it’s being processed.

The Technology

One example of confidential virtual machine (VM) computing is AMD Secure Encrypted Virtualization (SEV). AMD SEV is a hardware-based memory encryption feature integrated into AMD’s EPYC processors. It encrypts the memory of individual VMs so that even the hypervisor cannot access the plaintext data. This is achieved by assigning each VM a unique encryption key, managed by the AMD Secure Processor—a dedicated security subsystem within the CPU.

The encryption process operates transparently to the VM. Memory reads and writes are automatically encrypted and decrypted by the memory controller using the VM-specific key. This ensures that data remains encrypted outside the CPU boundary, rendering any unauthorized access attempts futile. While still responsible for resource allocation and scheduling, the hypervisor cannot intercept or manipulate the VM’s memory content.

Another essential component is a Trusted Execution Environment (TEE). A TEE provides a secure enclave within the main processor, isolating code and data from the rest of the system. TEEs ensure that sensitive computations occur in a protected environment, shielded from malicious software and hardware attacks. In the context of AMD SEV, the TEE is realized through hardware-enforced isolation mechanisms that segregate VM memory spaces.

The AMD Secure Processor plays an essential role in establishing the TEE by handling cryptographic operations and key management. It ensures that encryption keys are generated securely and remain inaccessible to unauthorized entities, including the hypervisor and system administrators. This hardware root of trust underpins the integrity and confidentiality guarantees provided by SEV.

Decentralized Confidential VMs

Before transitioning to decentralized confidential virtual VMs, it’s worth considering why we need them in the first place. After all, confidential virtual VMs are already out there, and you may not see the issue in sharing your personal data with Google, Amazon, or Microsoft. Up to 90% of decentralized networks still use centralized infrastructure like AWS to run critical operations. There are a few issues with this approach, however.

One is that centralized models have a single point of failure. By decentralizing the infrastructure, you can eliminate single points of failure and reduce trust dependencies inherent in centralized systems. Aleph.im and TwentySix Cloud have implemented this, using AMD SEV to deploy fully decentralized confidential VMs across a distributed network. Researchers and developers can sign up for their decentralized cloud, and free access is granted to those who qualify.

Each node in the network runs VMs with memory encryption enforced by SEV, ensuring that data remains confidential even in a multitenant environment.

This approach benefits from blockchain principles, where decentralization enhances security and resilience. The combination of TEEs and decentralized consensus mechanisms allows the network to maintain integrity in the presence of adversarial nodes. Data and code within the VMs are protected from external interference, and the decentralized nature of the network mitigates the risks associated with centralized control.

Azure’s Confidential VMs vs. Decentralized Confidential VMs

Microsoft Azure also offers confidential VMs using Intel SGX (Software Guard Extensions) and AMD SEV technologies. Azure’s platform provides isolated execution environments within its cloud infrastructure, protecting data from other tenants. However, the infrastructure remains under Microsoft’s dominion, introducing trust assumptions regarding the provider’s security posture and governance.

In contrast, decentralized confidential VMs operate across a network of independently controlled nodes. The trust model shifts from relying on a single provider to a consensus among multiple participants. This decentralization reduces the risk of systemic vulnerabilities and coercion by external entities. Plus, it aligns with the Web3 user sovereignty and data ownership ethos.

Technically, Azure’s and decentralized confidential VMs leverage AMD SEV for memory encryption and isolation. However, the deployment models differ significantly. Azure’s solution is confined to its data centers, while decentralized VMs span heterogeneous environments. The latter introduces key management, attestation, and coordination complexities but offers enhanced resilience and trust decentralization.

Practical Applications and Implications

Let’s look at some of the real-world applications of decentralized confidential VMs. With all the coverage around privacy issues, you might have guessed the first application: AI.

Training machine learning models often involves sensitive datasets containing personal or proprietary information. Deploying training processes within decentralized confidential VMs allows organizations to harness distributed computational resources without exposing raw data. The models benefit from federated learning, aggregating insights without compromising individual data privacy. LibertAI, for example, is exploring how Aleph.im’s confidential VMs can be leveraged to run secure, large-scale AI deployments without exposing sensitive personal data.

More broadly speaking, any type of analytics on data can be risky. With decentralized confidential VMs, we can get those analytics-drawn insights without the privacy risks associated with unencrypted and/or centralized computation. Enterprises can perform analytics on encrypted datasets across decentralized nodes. Homomorphic encryption and secure multiparty computation techniques can be employed within confidential VMs, enabling insights without decrypting the data. This mainly benefits industries bound by strict regulatory compliance, such as healthcare and finance.

Another use case is secure decentralized finance (DeFi) protocols and trading algorithms. DeFi platforms can execute complex financial contracts and trading strategies within confidential VMs. Private keys and transaction data are processed in an encrypted memory space, preventing the leakage of sensitive information. This enhances security for automated trading bots and smart contracts handling significant financial assets.

Gaming and NFTs also benefit from this technology. Verifiable Random Functions (VRF) can be securely implemented within confidential VMs, ensuring fair and transparent generation of random numbers for game avatar traits, lotteries, and more.

Ultimately, as the world shifts towards greater transparency, security, and privacy, decentralized confidential VMs represent more than an incremental improvement — they offer a paradigm shift in secure computing. They empower businesses and individuals to protect their information unprecedentedly, ensuring no centralized entity can access their data.

The post Start Securing Decentralized Clouds With Confidential VMs appeared first on The New Stack.

Agents Shift GenAI From Order Takers to Collaborators

Deepak Singh — Thu, 10 Oct 2024 16:00:56 +0000

From the use of modular components to the well-defined rules and syntax of programming languages, the way we build applications makes software development an ideal use case for generative AI (GenAI). Therefore, it is no surprise that software development is one of the first areas being transformed.

While the industry has made great strides in a short stretch of time, we have only scratched the surface of what is possible with GenAI-powered coding assistants. What started as the ability to simply predict the next line of code is quickly evolving into something entirely new. In the future, AI-powered agents will drive the majority of software development, and we are just starting to see this shift take shape.

However, generative AI is as much a problem of science and technology as it is a problem of human interaction. While agents may take the tedium out of software development, developers’ roles are only becoming more vital as they orchestrate these agents and bring products to life.

GenAI-Powered Assistants: A Quick History

While GenAI-powered assistants are relatively new, they have already evolved by several generations. The first generation of software development GenAI tools was autocompletion for writing code. While AI autocompletion for email or text predictions can be hit or miss for everyday communications, it can be immensely helpful for writing code. These models are trained on a broad set of patterns and have deep understanding of how code works, far exceeding any one person’s knowledge. This first generation of coding assistants allowed developers to generate lines or entire blocks of code based on what they were typing.

That was a great start, but there was still more to do. Second-generation GenAI enabled developers to chat directly with the models. Once a developer could access these models in their integrated development environment (IDE), the next logical step was to broaden how they could use them by enabling direct conversation with the models. Instead of using prompts, developers could pose questions (such as, “How does this section of code work?” or “What is the syntax for declaring a variable in Python?”) and receive responses — just like if they were talking with a colleague over Slack. This kind of chat-based coding showed the potential of these tools to not only generate code but reason through a problem on the user’s behalf. While the developer still has to drive every interaction and wait for a response, that reasoning capability helped set the stage for the next shift, taking place right now.

The third generation of assistants — and the one that will truly reshape how software gets built — uses AI-powered agents to do much of the heavy lifting. Agents are goal-seeking, and they can accomplish goals almost entirely on their own.

How Today’s AI Agents Differ

While the use cases in the previous generations were one-way interactions, AI agents are more like collaborators or team members. A developer sets the goal, but the agent reasons through the request, shares a plan based on the specifications and executes it. Developers can iterate on the details of the plan, but much of the grunt work is left to the agent.

For example, a developer working at an e-commerce company could ask an agent to specify the desired outcome: “Write a feature that allows customers to save specific items to view later.” The agent then generates a step-by-step implementation plan. After the developer approves the change, the agent handles the rest, connecting multiple steps to create the code, test it and make necessary changes across the entire codebase. AI agents give every developer access to their own “team of engineers” who can do everything from upgrading applications to the newest language version to building entirely new features.

This could save teams months of undifferentiated work, and it is only the beginning. In the future, agents will handle more of the software development lifecycle, freeing developers to focus on where they can have the most impact.

Welcome to Spec-Driven Development

It all sounds great — but it raises an important question: Will AI agents sideline thousands of talented software developers? Not at all. If anything, the role of developers will become even more vital as they look to guide these agents from idea to production.

For decades, developer talent has often been mischaracterized as the ability to write code in arcane (and constantly changing) programming languages. Today, being an expert on a specific language may not be the most critical skill. Instead, developers will need much more experience in systems thinking and design. The ability to foster a deep understanding of a problem and translate that into a specification that a machine can understand will become a critical skill.

Depending on the problem, these specifications may be defined by a simple prompt or developed collaboratively as part of a larger plan with an agent. While working backwards from a problem to figure out where you’re going isn’t a new concept, the challenge becomes much more interesting in the GenAI era. With an eager AI agent just a prompt away, developers will need to be much more deliberate in articulating what they want and how they want it done to maximize an agent’s potential.

Perfecting the Last Mile

In transportation planning, the “last mile” is typically the last leg of the journey before something reaches its final destination. While the last mile may be the shortest part of the trip, it’s often the most complex — chock-full of obstacles, twists and turns that make reaching the end difficult.

Similarly, an AI agent may help developers swiftly move through the middle of the development process, but the hardest part comes at the end. Only a keen-eyed developer can discern if the end product produced by the agent actually meets the developer’s original goal.

Developers must ask: “Is this the application I want?” If the AI agent’s product meets the exact needs, the developer can start thinking about how to make it even better. If it falls short, the developer has a host of options. Some developers may want to dive deep into the code, making their own tweaks and optimizations to make the application truly stand out. Others may consult more agents and work through problems iteratively, chipping away to make their vision a reality.

There really is no wrong way to do it. And, with agents eliminating so much of the undifferentiated work in the middle, developers will have the time to perfect their own last mile to deliver something truly remarkable.

Moving Forward: Illuminate and Clarify

At Amazon, our principal engineers (some of our most senior and experienced technical employees) follow a tenet called “Illuminate and Clarify.” The core of that principle is about distilling complexity, boiling a problem down to its essence and driving a shared consensus for how to solve it.

The hallmark of software development in the age of AI agents will be very much the same. Because, ultimately, software development is about so much more than code. It’s about building systems that do what users want to accomplish.

The post Agents Shift GenAI From Order Takers to Collaborators appeared first on The New Stack.

CIQ Unveils a Version of Rocky Linux for the Enterprise

Steven J. Vaughan-Nichols — Thu, 10 Oct 2024 15:18:08 +0000

On Oct. 8, 2024, CIQ announced the launch of Rocky Linux from CIQ (RLC). This is an enterprise-grade version of the popular open source Rocky Linux distribution. This new offering aims to meet the needs of organizations that rely on Rocky Linux but require additional security, compliance, and support features for their enterprise environments.

If that sounds like Red Hat Enterprise Linux (RHEL), while CIQ wouldn’t put it that way, you wouldn’t be far wrong either. Rocky Linux, for those of us who don’t know this distro, is a revival of CentOS, the former RHEL clone. When Red Hat retired this iteration in favor of CentOS Stream, CentOS co-founder Gregory Kurtzer announced he’d create his own RHEL clone and CentOS replacement, which he named in honor of his late CentOS co-founder Rocky McGough.

Today, Rocky Linux comes in two versions: Rocky Linux 8.10, for the x86_64 and aarch64 architectures, and Rocky Linux 9.4, for the x86_64, aarch64, ppc64le, and s390x architectures. CIQ has offered technical support for Rocky Linux since 2021, and tech support is still available today. RLC is an addition to CIQ’s support offering, not a successor.

What RLC gives you, besides full compatibility with the community edition of Rocky Linux and RHEL, are the following enhancements:

Security and Compliance: The enterprise version includes verified packages, guaranteed security patches, and remediation of common vulnerabilities and exposures (CVEs) within specified service level objectives (SLOs).
Legal Protection: CIQ offers customers indemnification in the event of intellectual property disputes related to open source license compliance.
Supply Chain Validation: Verified packages are hosted in US-based repositories and mirrored globally to ensure proximity to customer data centers.
Private Repository Access: Subscribers gain access to CIQ’s private repositories for Rocky Linux packages, ISO images, and container images.

In short, as Kurtzer, who’s also the CEO of CIQ, explained in a statement, “Rocky Linux from CIQ meets the needs of organizations who want to run community Rocky Linux within their IT infrastructure but need contractual guarantees and mitigation to liabilities that the open-source community cannot provide. Now you can have the best of both worlds.”

Rocky Linux from CIQ is now available with an annual flat-rate subscription price of $25,000/year. It’s an interesting take: CIQ doesn’t count your environments or audit usage. You pay one price no matter how many instances you’re using or where you run them.

Of course, RLC isn’t the only RHEL-style Linux distro with business support around. Besides RHEL itself, Oracle has Oracle Linux. While the community RHEL Linux clone AlmaLinux OS doesn’t offer business support, companies such as OpenLogic and TuxCare offer tech support for the distro.

Still, RLC’s launch represents a significant step in Rocky Linux’s evolution as a viable business Linux option. It offers enterprises a robust, secure, and compliant option for their Linux infrastructure needs while maintaining the benefits of open source software.

The post CIQ Unveils a Version of Rocky Linux for the Enterprise appeared first on The New Stack.

How To Work With Date and Time in Python

Jessica Wachtel — Thu, 10 Oct 2024 14:14:15 +0000

We expect our applications and services to always be on time. Tasks like automation, data collection, scheduling, security and IoT integrations would look completely different without the confidence of precise timing. The world would look completely different if each developer built their applications and functions based on their watch. Fortunately, we have the system clock, which provides a universal reference across all programming languages and hardware. In Python, you can easily access this clock using the datetime module.

The datetime module references the system clock. The system clock is a hardware component in computers that tracks the current time. It counts the seconds since a fixed point known as the “epoch,” which is Jan. 1, 1970, on most systems.

Operating systems provide an interface for applications to access the system clock through system calls or APIs. These system calls and APIs return the current date and time. The accuracy and precision of this time depend on the hardware and the OS’s timekeeping mechanisms, but it all starts from the same place.

Python’s time interface is the datetime module. It calls system APIs to retrieve the current date and time.

How Does `datetime` Work?

To first work with dates and times, you’ll need to import the datetime module. The module will import all the methods and attributes of the datetime object into your application. Working with the datetime object will follow the object-oriented programming syntax.

View the code on Gist.

To get the current date and time, you can use the datetime.now() method. It will return the full datetime object with the current date and time down to the nanosecond.

View the code on Gist.

The format is: 2024-07-30 08:59:46.989846

You can also split this if you only need the date or only need the time. Calling the following two methods will extract more limited information from the datetime object.

To print today’s date, use the date.today() method:

View the code on Gist.

To pull just the current time for your application, you’ll have to extract the time from the datetime object.

View the code on Gist.

Formatting

You can reformat the dates and times as strings using the strftime() method. This allows you to specify your preferred format using format codes. Here’s a common format code:

– %Y updates the year

The following codes update the specified time as a zero-padded decimal number (for example, 01):

– %m updates the month
– %d updates the day
– %H updates the 24-hour clock
– %M updates the minute
– %S updates the second

A complete block of code that utilizes these format codes might look like this:

View the code on Gist.

Working With Time Zones

You can adjust the datetime object to reflect different time zones using the pytz library. Before you use it, you’ll need to import it:

View the code on Gist.

It’s not required that you get the UTC time first, but it is best practice because the UTC never changes (including during daylight savings time), so it’s a strong reference point.

View the code on Gist.

Python’s datetime module saves the date!

The datetime module simplifies working with timing in Python. It eliminates much of the complexity involved in synchronizing applications and ensures they operate with accurate, consistent timing.

The post How To Work With Date and Time in Python appeared first on The New Stack.

How AI Agents Are About To Change Your Digital Life

Denis Kuria — Thu, 10 Oct 2024 13:16:57 +0000

Imagine learning a new skill or understanding a complex concept, only to forget it entirely the moment you step away. Then when you need that knowledge again, it’s gone and you have to start from scratch. Frustrating, right? This lack of continuity would make it nearly impossible to build on your experiences or tackle increasingly complex tasks.

AI agents face a similar problem. They can process information, answer intricate questions and handle multistep workflows, but without a way to retain what they’ve learned, they start each interaction with a blank slate. For these agents to perform effectively, they need a memory system that allows them to recall and build upon past interactions. This is where vector databases come in. Milvus, an open source vector database created by Zilliz, enables AI agents to store, manage and retrieve high-dimensional data efficiently, giving them the memory they need to make smarter decisions and adapt over time.

Let’s delve into what AI agents are and how vector databases like Milvus enhance these systems to unlock their full potential.

Understanding AI Agents

AI agents are software entities designed to perform tasks autonomously. They are driven by complex algorithms and can interact with their environment, make decisions and learn from experiences. These agents are employed in various applications such as chatbots, recommendation systems and autonomous vehicles.

At their core, AI agents operate through a cycle of perception, reasoning, action, interaction and learning.

Structure of an intelligent agent

Perception

The process begins with AI agents gathering information from their surroundings through sensors or user inputs. For instance, a chatbot processes text from a conversation, while autonomous vehicles analyze data from cameras, radar or lidar sensors. This gathered data forms the agent’s perception of its environment, setting the stage for informed decision-making. The accuracy of this perception is crucial as it significantly impacts the quality of subsequent actions and interactions.

Reasoning

Once data is collected, AI agents process and analyze it to derive meaningful insights. This stage involves using large language models or rule-based systems to interpret the input, identify patterns and contextualize the information. The reasoning process is also influenced by the agent’s world-knowledge memory, allowing it to leverage past experiences for improved decision-making. For example, in a recommendation system, the agent analyzes user preferences and behavior to suggest relevant content. Reasoning is critical for understanding the environment and predicting the consequences of potential actions.

Action

Following the reasoning phase, the agent takes action based on its analysis. This might involve responding to a user query in a chatbot, suggesting a product in an online store or making a steering adjustment in an autonomous vehicle. The actions are not isolated events; they are direct outputs of the agent’s reasoning process. Effective actions rely on accurate perception and sound reasoning to ensure the agent can perform its intended tasks successfully.

Interaction

Beyond singular actions, AI agents often engage in continuous interaction with their environment and users. Interaction is a more dynamic form of action where the agent repeatedly exchanges information with the external world. This ongoing dialogue allows the agent to refine its understanding and adjust its behavior in real time. For instance, in a conversational AI, the interaction involves maintaining context over multiple exchanges, adapting responses based on user feedback and providing a coherent experience. This iterative exchange is crucial for environments that change frequently or require complex decision-making over time.

Learning

Learning distinguishes AI agents from traditional software. After taking action and interacting with the environment, the agent evaluates outcomes and adapts its future behavior. This learning process is driven by feedback loops, where the agent learns from its successes and failures. By integrating the knowledge memory, the agent continually updates its understanding of the environment, making it more adept at handling new and unexpected scenarios. For example, an autonomous vehicle improves its navigation by analyzing previous driving conditions, and a recommendation system refines its suggestions based on user feedback. This continuous learning cycle ensures that AI agents become more effective and intelligent over time.

While these stages outline the fundamental workings of an AI agent, their true potential is unlocked when they can store and retrieve knowledge in the long term, enabling them to learn from past experiences and adapt. This plays a pivotal role in enhancing these agents’ memory and decision-making capabilities.

How Vector DBs Empower AI Agents

Vector databases (DBs) are specialized databases optimized to handle high-dimensional vectors, which are numerical representations of complex data like text, images and audio. Unlike traditional databases that store structured data, vector DBs store vectors to facilitate similarity searches, which is essential for tasks like information retrieval and recommendation. Milvus is an open source vectorDB designed specifically for these requirements, providing a scalable and efficient solution. It is the most popular vector database in terms of GitHub stars.

Vector DBs like Milvus serve as a memory system for AI agents, enabling them to handle vast amounts of high-dimensional data efficiently. It’s important to note that not all vector DBs are the same. It’s important to pick one with comprehensive search features and that is highly scalable and performant. Vector DBs with these types of features, such as Milvus, are key to building more intelligent AI agents.

Building Long-Term Memory

Agents rely on long-term memory to retain information and context across interactions. They must have access to an efficient way to store and retrieve semantic data:

Efficient indexing: Indexing techniques like HNSW (Hierarchical Navigable Small World) allow agents to quickly find relevant information. These techniques help navigate high-dimensional spaces swiftly, enabling agents to pull up the right information without delay.
Flexible schema: Agents often need to store additional metadata alongside their vector data, such as the context or source of the information. A dynamic schema design like what Milvus offers allows the addition of metadata to each vector flexibly. This enriches the agent’s memory, offering a fuller picture of stored knowledge.

Enhancing Context Management

For agents to maintain coherent interactions, they must efficiently retrieve relevant data.

Approximate nearest neighbor (ANN) search: ANN algorithms find vectors most similar to a given query. This quick retrieval of relevant data allows agents to provide informed and context-aware responses, crucial in dynamic environments.
Hybrid search capabilities: Context isn’t just about similarity; sometimes, agents need to consider specific attributes alongside semantic relevance. Hybrid searches that combine vector similarity with scalar filtering give agents the flexibility to fine-tune their information retrieval, ensuring more precise outcomes.
Real-time search: Agents need access to the most current information. Real-time data insertion and near real-time search ensures that agents are always working with up-to-date knowledge, making their responses more accurate and relevant.

Ensuring Scalability and Performance

As agents scale in complexity and data volume, their underlying memory system must handle this growth without sacrificing performance.

Distributed architecture: A distributed architecture divides tasks and data across multiple machines, or nodes, that work together as a single system. This setup allows horizontal scaling, meaning you can add more nodes to handle increasing data or query loads. For AI agents, this distributed setup ensures they can manage large volumes of data without slowing down. For example, if an AI agent needs to process billions of pieces of information, this data can be distributed across multiple nodes, maintaining fast response times and avoiding bottlenecks.
Load balancing and sharding: Load balancing distributes workloads evenly across different servers or nodes, preventing any single machine from becoming overwhelmed. Sharding is the process of breaking up large data sets into smaller, more manageable pieces called shards. A shard is a horizontal data partition in a database. Using both techniques optimizes the vector database’s performance. When data and query workloads are spread evenly across the cluster, each machine only has to handle a portion of the work, which increases efficiency. This is particularly important for agents that need to process large data sets quickly. By breaking the data into shards and distributing them, queries can be processed in parallel, making operations faster and smoother.
High throughput and low latency: Throughput measures how many queries a system can handle in a given time, while latency is the delay before the system responds to a query. For applications that require instant responses — such as chatbots, search engines or recommendation systems — high throughput and low latency are crucial. Milvus is designed to handle thousands of queries per second (high throughput) and return results within milliseconds (low latency), even when working with billions of vectors. This allows AI agents to provide real-time responses to users, making them suitable for applications that need quick, on-the-fly decision-making.

Practical Applications of Milvus-Enabled AI Agents

Combining scalable performance and seamless data retrieval creates a powerful tool for a variety of industries. Here are some practical applications where Milvus-enabled AI agents can thrive:

Conversational AI and Customer Support

These conversational AI agents can retain context over long interactions, making them more effective in customer support roles. Traditional chatbots often struggle to maintain coherent conversations beyond a few exchanges. A vector database-enabled AI agent can store and retrieve previous interactions, enabling it to understand ongoing conversations and provide more personalized responses.

Example: Consider an AI agent deployed by an e-commerce platform. A customer contacts the support team regarding a product issue. The AI agent recalls the customer’s previous interactions, such as past purchases, previous support tickets and chat history. This memory allows the agent to provide context-aware assistance, such as troubleshooting steps tailored to the customer’s situation or offering product recommendations based on their purchase history.

Personalized Content Recommendations

These AI agents can provide personalized content recommendations by analyzing user behavior and preferences. By storing user interactions as vectors, these agents can match current behavior with past patterns to recommend articles, videos, products or other content.

Example: A streaming service uses an AI agent to recommend shows to its users. When a user watches a series, the AI agent generates vector embeddings representing the show’s features (genre, actors, themes) and the user’s interaction patterns. Over time, the agent learns the user’s preferences and compares new content to the stored embeddings. If the user enjoys thrillers with a certain actor, the agent can identify and recommend similar content, enhancing the user’s viewing experience.

Fraud Detection in Financial Services

In financial services, these types of AI agents can detect and prevent fraud by analyzing large volumes of transaction data. By converting each transaction into a vector that captures key attributes, such as transaction amount, location and time, agents can identify patterns and flag anomalies in real time.

Example: A bank employs an AI agent to monitor transactions for signs of fraud. The agent stores vectors representing normal transaction patterns for each customer. If a transaction significantly deviates from these patterns — such as a large withdrawal in a foreign country shortly after a similar transaction locally — the agent can quickly retrieve this information and flag the transaction for review. By doing so, the agent helps reduce false positives and identifies genuine threats promptly.

Autonomous Vehicles and Navigation

AI agents in autonomous vehicles process and interpret sensory data from the vehicle’s environment. By storing vector embeddings of objects, road conditions and previous navigation routes, the Milvus-enabled agent can make informed decisions in real time.

Example: An autonomous vehicle uses an AI agent to navigate city streets. The vehicle’s sensors constantly feed data into the agent, which generates vectors representing various elements like road signs, pedestrians and obstacles. The agent compares this incoming data with stored embeddings of known scenarios to make split-second decisions. For instance, if the agent recognizes a complex intersection it has navigated before, it can recall the optimal route and driving behavior, improving both safety and efficiency.

Conclusion

Vector databases like Milvus are crucial in building intelligent AI agents. They provide a powerful memory system capable of storing, searching and retrieving high-dimensional data. They also enable AI agents to handle complex tasks, offer personalized interactions, and adapt to changing environments through efficient similarity search and continuous learning.

As AI agents continue to evolve, vector databases’ role in supporting advanced applications will only grow. By leveraging their capabilities, you can build AI agents that are not only intelligent but also contextually aware and adaptable. Visit the Zilliz GenAI Resource Hub to learn more.

The post How AI Agents Are About To Change Your Digital Life appeared first on The New Stack.

Rust’s Expanding Horizons: Memory Safe and Lightning Fast

Darryl K. Taft — Thu, 10 Oct 2024 12:00:33 +0000

The Rust programming language continues to be one of the top 15 programming languages and has been the most admired language among developers for nine consecutive years, according to various studies.

In an interview highlighting the Rust language’s growing importance in the programming world, Joel Marcey, director of technology at the Rust Foundation, discussed some of the various initiatives to improve its security, performance and adoption across different domains.

Marcey has had a varied career in software engineering, including roles at Intel, Microsoft, and Facebook (now Meta), before joining the Rust Foundation. In this episode of The New Stack Makers podcast, which I hosted, I asked him about issues such as Rust’s expanding use cases. He noted that while Rust is known for systems and backend programming, it’s gaining traction in embedded systems, safety-critical applications, game development and even the Linux kernel.

This podcast was recorded on the heels of the RustConf 2024 conference, held in Montreal in September. Marcey’s key takeaway from the event was that “the Rust community is as motivated and vibrant as ever to see Rust continue to grow and succeed as a broad, first-class programming language for developers.”

A Safe and Fast Language

Rust is a really safe and fast systems language, Marcey said. And you can use Rust on the web with a WebAssembly (Wasm) backend. Wasm allows Rust to be used in web applications, though adoption is still in the early stages.

Meanwhile, Marcey addressed the Rust versus Go conundrum, stating that while both languages focus on memory safety and efficiency, Rust may have an edge in performance-critical applications.

“There are many similarities between the two languages, which allows for either one to be the right choice, given your use case,” Marcey said. “So I think you can consider both general-purpose programming languages that focus on memory safety and produce really efficient code. But if you want to speak in terms of Rust, if you care about eking out every last bit of performance and speed, and you know that speed of execution matters above anything else, you may want to give it a look.”

Recent Developments

Moreover, Marcey discussed recent Rust developments, including Rust 1.81, which introduced new sort implementations. He also discussed some of the project goals for 2024, which include a new edition, async improvements and enhancing Rust for Linux.

The podcast also looks at government adoption of Rust, including how the U.S. federal government is considering Rust, with the Defense Advanced Research Projects Agency (DARPA) announcing an initiative to translate C code to Rust – known as Translating All C to Rust TRACTOR.

Also, Marcey talked about Rust-C++ interoperability: Google has funded a project that aims to improve interoperability between Rust and C++.

The conversation also covered the Rust Security Initiative, which aims to maintain Rust as one of the most secure development platforms, with projects like Painter and Typomania addressing ecosystem security. Macey also mentioned the Safety-Critical Rust Consortium, a new group dedicated to the responsible use of Rust in critical and safety-critical software.

Check out the full episode for more insight about Rust.

The post Rust’s Expanding Horizons: Memory Safe and Lightning Fast appeared first on The New Stack.

Buildkite Expands Scale-Out Continuous Delivery Platform

Joab Jackson — Wed, 09 Oct 2024 20:22:57 +0000

Buildkite Pty Ltd has expanded its namesake concurrency-minded continuous integration and delivery software to make it a full-fledged platform, adding in a test engine, a package registry service and a mobile delivery cloud.

Launched a decade ago, by now Buildkite CEO Keith Pitt, the software was designed to run concurrently, allowing users to run a hundred times as many agents compared to traditional build pipelines.

As a result, the software found use in many scale-out companies, being put to work at Airbnb, Canva, Lyft, PagerDuty, Pinterest, PlanetScale, Shopify, Slack, Tinder, Twilio, Uber and Wayfair, among others.

On TNS, we’ve documented how Equinix uses Buildkite to update the many OSes it supports on its bare-metal cloud.

Pitt created the software when he was a developer, working with Heroku and a git code repository.

“Heroku was a magical platform. Heroku did something that no other platform did, and then they cared about the developer experience,” Pitt said. The company he worked at mandated the use of Jenkins, which was, at the time, difficult to work with, especially when accessing assets remotely.

“I needed a different approach to do my job,” he said.

Overall, software built on the software is used by more than a billion people per day, according to the company.

According to a statement from Uber Engineering Manager Shesh Patel, the ride-share giant had cut its build times in half by switching to Buildkite. “Adopting a delivery-first mindset has been crucial to our ability to grow,” he asserted.

How Buildkite Differs From Other CI/CD Systems

Buildkite is different from other CI/CD software and service providers in two major ways, Pitt claims. One is that it is built to run concurrently, supporting the ability to run multiple jobs at the same time. Another is that it doesn’t charge by the build minutes or number of concurrent jobs, two widely-used billing methods in the CI/CD space.

Instead, Buildkite offers a per-seat, unlimited-use pricing model.

In many cases, organizations are tied to “legacy DevOps tools” that tie them to slowed build cycles, noted Jim Mercer, program vice president of Software Development DevOps and DevSecOps at IDC, in a statement.

To illustrate why speeding continuous integration is so important for scale-out companies, Pitt offered an example: A company like Uber may have 5,000 developers. At the start of the workday, the majority of those developers will start making code commits more or less simultaneously. With Uber’s complex codebase of 50 million lines of code or more, each change may kick off up to 50,000 separate tests. Multiply that by the 5,000 changes, and a build system may be managing hundreds of millions of events simultaneously.

“You can’t run one test after another. Otherwise, it would take weeks, months, years, even some cases, to run tests sequentially,” Pitt added. “So you have to paralyze you have to run them concurrently.”

The software, available as open source, can be easily duplicated to run as many build workflows as needed.

Developers define the steps, or a pipeline, that a set of code should go through before being placed into production, which may involve unit and integration tests, as well as other checks. Each of the steps is handled by build runner agents, written in the Go programming language so they can be run on different platforms. Each agent polls Buildkite’s agent API over HTTPS. Outputs are stored and reused as artifacts.

Another issue in this space that impedes the ability to scale out continuous integration is how customers are billed, Pitt noted.

“A lot of the other players in this space are incentivized not to make you go faster, because their main revenue streams are from compute,” Pitt said. “They resell electricity, so they’ve got no raw incentive for you to go faster.”

As an alternative, Buildkite charges per active user, which gives the company to drive down workflow times to as close to zero as possible through concurrency.

Buildkite runs on a hybrid architecture, meaning it uses the customer’s compute capabilities, while the company runs operations on its own cloud-based control plane (Pitt calls this approach the Bring Your Own Cloud [BYOC]). BuildKite itself has no access to the code itself (a true security benefit for many organizations).

How Buildkite Has Been Expanded

For the new release, Buildkite expanded its BYOC format to package registries, offering a high-performance asset management service with rapid indexing and enhanced security features. The customer provides the storage and Buildkite provides the management.

The company has also ramped up its own cloud environment for running mobile applications on behalf of clients, based (unlike other Buildkite offerings) on per-usage pricing. It is ideal, Pitt said, for organizations that don’t want to manage the complicated logistics of mobile application delivery.

The post Buildkite Expands Scale-Out Continuous Delivery Platform appeared first on The New Stack.

Checks by Google: AI-Powered Compliance for Apps and Code

Loraine Lawson — Wed, 09 Oct 2024 18:00:37 +0000

Google is making its AI-powered compliance platform generally available as Checks by Google. It encompasses three new product offerings that will check apps, code and soon AI models for compliance issues, including personally identifiable information, government regulatory requirements, and whether a developer model will “talk out of turn” by providing controversial or inappropriate responses.

Checks by Google emerged from Google’s incubator and was used internally to test Google’s own large language models, said Fergus Hurley, the co-founder and general manager of Checks by Google.

“We’re providing insights and tools to these companies because most of them do not have insights and tools that they need,” he said. “Some of our customers include the top five social media apps, the top five games, the top five finance apps, and these companies are very well-resourced, but they just don’t have the level of insight that they need to get through their job. We bridge that gap [between] the development team and the compliance team.”

Checks by Google is integrated with Vertex, the Google Cloud offering for generative AI, but works with other major model providers as well. Vertex has more than 150 generative AI models available, including Anthropic’s Claude and Mistral.

App Compliance

Checks by Google has three offerings: App Compliance, which is available now, and Code Compliance and AI Safety, which are in a closed beta with a waitlist.

App Compliance checks an app or website or service to see if it’s complying with rules for collecting user data. For example, it can check for compliance against the GDPR in Europe, the CCPA in California and the LGPD in Brazil.

“We look at what the app is required to do based on the different regulations around the world,” Hurley said. “We cover these rules and if you have users in those regions, we turn on those checks.”

App Compliance can also run an analysis on a publicly used app to check against any organizational privacy policies. It relies on an LLM that’s fine-tuned for understanding privacy policies, comparing them to what the app or product is actually doing and performs both dynamic and static analysis, he said. For example, Checks runs the apps on actual physical devices and monitors the network traffic coming off the apps as well as the user experience.

“We’ve got a smart AI crawler looking at what the app is actually doing. It’s able to play games, it’s able to log into the app if the login credentials are provided to us, but it’s a smart AI crawler.”
— Fergus Hurley, co-founder and general manager, Checks by Google

“We have built our own fine-tuned model for understanding privacy policies, and that’s used across many teams at Google now as well,” he said. “We are able to identify issues that the product might have from a compliance perspective. We’ve got a smart AI crawler looking at what the app is actually doing. It’s able to play games, it’s able to log into the app if the login credentials are provided to us, but it’s a smart AI crawler.”

Hurley noted that co-founder Nia Cross Castelly is a lawyer, although Hurley cautioned that the AI is not providing legal advice.

“She was responsible for Google Play policies, which developers follow very closely to get access to billions of users,” he said. “So we do have a lawyer overseeing this stuff, but we are not providing legal guidance. That’s important.”

Instead, the AI is simply providing insights and tools that bridge the gap between the development team and the compliance team, he added.

Code Compliance

Code Compliance, which is in closed beta, allows developers to get ahead of regulatory issues before an app is published. It can be integrated into an IDE, Hurley said, so that developers receive alerts directly in the IDE about issues that are integrated into their build systems.

Code Compliance provides information about critical issues, which can include security issues, but it also might detect, for example, an outdated SDK.

“We also help people create, manage and maintain their safety labels on Google Play,” he said. “That’s the Holy Grail, being able to be that one place that people need to go to be able to get all their compliance insights.”

AI Safety

The third offering, which is being tested in a closed beta, is AI Safety.

Developers have three main problems they need to solve with AI and AI-driven apps, Hurley said. First, they need to be able to set up their policies, which he called the ‘align phase.’ During this phase, they determine what policies are relevant to them and their website or app.

Second, they need to evaluate to make sure that the policies are adhering to their actual initial model release and vice versa. Third, after launching the actual GenAI product, developers need to ensure it’s behaving correctly in the wild.

“We have built a product to help with each of those parts as part of this AI safety product, so really, it’s trying to build out the governance command center here,” he said. “In the first phase, the alignment, people want to be able to configure their policies, and right now, we support the core policies that really every generic product needs around violence, hate speech, and sensitive data — such as PII (personally identifiable information).”

The evaluation phase helps developers determine whether the policies adhere to their initial model release and vice versa, he said. This phase is where the product performs red teaming and adverbial testing using prompts developed in-house by Google, he said.

Screenshot of the Checks by Google framework

Red teaming is a security testing technique that involves simulating malicious attacks on a system or organization to identify vulnerabilities and assess resilience. The name comes from a controlled war game where a red team attempts to breach the security of a target, which a blue team defends. Adversarial testing is a type of software testing that involves intentionally trying to break an application or system by introducing unexpected or malicious inputs. It helps find weaknesses that could be exploited by malicious actors.

“One thing developers struggle with once they’ve built their model is coming up with these adversarial prompts, and that’s where us having this huge corpus of adversarial prompts is so valuable, and that’s where we’re leaning into a lot of the work being done by Gemini and Gemma and other Google teams as well,” he said, adding that it also incorporates the best practices developed by those teams. “We run those prompts against the developer model, and we make sure the responses that form that model are the ones that the developer would want.”

This step shouldn’t be underestimated. Not only can AI models produce statements that could be publicly embarrassing, but they can also provide incorrect information that costs companies money.

“People sometimes try and get the models to do things that they shouldn’t do,” he said. “One of the most public cases of this is where Air Canada had a case where their agent responded and said that the person could get a refund. What ended up happening is that Air Canada said, ‘No, you can’t get a refund based on these conditions.’”

That led to a court case and in the end, Air Canada did have to issue the refund, he said.

“It’s now sort of set the rules that companies are responsible for what their GenAI agents say and do, and they do have that responsibility,” he said. “Making sure that the agent is following the company’s policies is so critical, but that’s one of the most public examples of one of the things that will be prevented by the model being fine-tuned, based on the company’s policies, to only talk about what the company actually offers.”

“One things developers struggle with once they’ve built their model is coming up with these adversarial prompts, and that’s where us having this huge corpus of adversarial prompts is so valuable, and that’s where we’re leaning into a lot of the work being done by Gemini and Gemma and other Google teams as well.”
— Fergus Hurley

Third, once the GenAI product is released, developers need to be able to monitor whether it behaves correctly in the wild. For instance, there was a case of a general-purpose AI agent that a company launched to provide for a specific use case, but people learned they could “hack” it and gain free access to a model that’s actually very expensive to use.

“The most important part is just making sure that things don’t go off the rails and having those safeguards in place,” he said. “We have a guardrails product where it monitors the input prompt and the output prompt and detects issues, like an input prompt that should not reach the model would be when someone is trying to potentially jailbreak the model, and then on the output side of things, no PII should ever be exposed by a model, and there should be many safeguards in place to prevent that from happening.”

He pointed out that if developers do not want to perform the fine-tuning themselves, they can use guardrails to automatically create monitoring and safeguards.

“They are able to go in and select for the input guardrails, [e.g.] I want to turn on jailbreak prevention,” he said. “Let’s say they’ve configured these different ones with different sensitivity thresholds, then they can configure the output thresholds, and then they deploy that to production as part of their model.”

Guardrails act as a filter on the model rather than actually fine-tuning it directly. The filter does not significantly impact performance, he added.

Customizable by Industry, Focus

Checks by Google’s offerings can be customized by industry as well, he added. For instance, if an app is dealing with children’s data, there are very specific rules that app must follow.

“We work with some of the biggest game companies out there for kids,” he said. “You could imagine healthcare and finance or other major heavily regulated industries have their own very specific needs, and that’s where the goal, over time, is to build out this ecosystem of checks, where people are able to turn on the Checks that are most relevant to their business.”

Hurley said Checks by Google’s products cut across all types of developers; they see frontend developers, as well as backend and full stack developers, using the tools.

Right now, it’s free to get started, although some businesses are more complex and do need paid services to help them with compliance, he said.

The post Checks by Google: AI-Powered Compliance for Apps and Code appeared first on The New Stack.