
Docker Container Scheduling as a Bin Packing Problem

Philip Thomas
Updated — October 15, 2020 • 3 minute read

For the internal OpenDNS engineering hackathon earlier this month, I used data from our Quadra system to develop a Docker container scheduler. The tool combines historical data about container resource consumption with a mathematical model to decide which host should run each container. This formulation is a type of bin packing problem, and I used the JuliaOpt project’s JuMP package to express it in the Julia programming language.

Problem Statement

Given pools, each consisting of a quantity of identical Docker containers with known historical memory, CPU, and network I/O usage, and given a finite number of hosts, each with known memory, CPU, and network I/O capacity, assign each container in a pool to a single host. On each host, the total memory and the total network I/O of its containers must each be less than the host’s capacity for that resource. Because CPU is the main resource constraint for hosts, minimize the expected CPU usage on each host. In addition, attempt to keep containers from the same pool on different hosts to create redundancy.
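
One way to write the statement above as an integer program is sketched below. This is an illustration, not the exact model from the hackathon script. All symbols are introduced here: x_{ch} is 1 when container c runs on host h; m_c, n_c, and u_c are the historical memory, network I/O, and CPU usage of container c; M_h and N_h are the memory and network I/O capacities of host h; and z is the CPU load of the busiest host. Reading "minimize the expected CPU usage on each host" as "minimize the busiest host's CPU" is an assumption, as is writing the redundancy goal as a hard constraint (at most one container per pool P on each host) and using non-strict inequalities for capacity.

```latex
\begin{aligned}
\min_{x,\,z}\quad & z \\
\text{s.t.}\quad
& \sum_{h \in H} x_{ch} = 1        && \forall c \in C \\
& \sum_{c \in C} m_c\, x_{ch} \le M_h && \forall h \in H \\
& \sum_{c \in C} n_c\, x_{ch} \le N_h && \forall h \in H \\
& \sum_{c \in C} u_c\, x_{ch} \le z   && \forall h \in H \\
& \sum_{c \in P} x_{ch} \le 1         && \forall h \in H,\ \forall P \in \mathcal{P} \\
& x_{ch} \in \{0, 1\}                 && \forall c \in C,\ \forall h \in H
\end{aligned}
```

If a pool holds more containers than there are hosts, the hard redundancy constraint makes the model infeasible, so a penalty term in the objective would be a closer match to "attempt to keep".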

Data

Collecting historical data about the resource usage of each container proved to be the most time-consuming portion of the process. The Quadra system logs resource consumption for each container in InfluxDB, so my hackathon project starts by querying the InfluxDB API for a list of currently running containers and their resource consumption over the last hour. Data about running Docker containers is also accessible through the Docker Stats API introduced in v1.5.0 or by using cAdvisor.
Resource consumption by containers tends to be highly variable. For some pools, such as web apps, traffic changes in proportion to the number of web visitors throughout the day. For pools used by headless browser tests, resource consumption is often low with periodic bursts in CPU. Staging environments use comparatively few resources.
In the formulation, I assume that a container’s measured resource consumption is its only cost to a host; any other host overhead is treated as negligible.
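
The step of turning an hour of samples into one number per container can be sketched as follows. The sketch is in Python for illustration (the hackathon script was written in Julia), and the container names, sample values, and the `estimate_demand` helper are all hypothetical. The `k` parameter is an addition: with `k = 0` the function returns the plain average, and larger values pad the estimate for bursty containers.

```python
from statistics import mean, pstdev


def estimate_demand(samples, k=0.0):
    """Summarize usage samples as mean + k * population standard deviation."""
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples) + k * pstdev(samples)


# Hypothetical CPU samples (percent) collected over the last hour.
cpu_samples = {
    "web-1": [20.0, 40.0, 60.0, 40.0],  # follows visitor traffic
    "test-1": [5.0, 5.0, 5.0, 85.0],    # mostly idle with one burst
}

# Plain averages (k = 0) and estimates padded by one standard deviation.
averages = {name: estimate_demand(s) for name, s in cpu_samples.items()}
padded = {name: estimate_demand(s, k=1.0) for name, s in cpu_samples.items()}

for name in cpu_samples:
    print(f"{name}: average={averages[name]:.1f} padded={padded[name]:.1f}")
```

With these made-up numbers the bursty test container has the lower average but the higher padded estimate, which is the kind of variability described above.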

Calculation

The hackathon container scheduling system was written using the Julia programming language. I formulated the problem using the JuMP package from JuliaOpt. JuMP provides a metalanguage for expressing optimization problems, then passes off the actual calculation to configurable solvers. I used the open-source CBC solver from the COIN-OR project.

After running the calculation, the script makes API calls to move containers between hosts.
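
To make the calculation concrete without a solver dependency, here is a minimal branch and bound sketch in Python. It is a stand-in for illustration, not the JuMP model or the CBC solver described above. The function name, data layout, and sample numbers are hypothetical; it treats capacity limits as inclusive, minimizes the CPU load of the busiest host (one possible reading of the objective), and leaves out the redundancy goal.

```python
def schedule(containers, hosts):
    """Assign each container to one host, minimizing the busiest host's CPU.

    containers: list of (name, mem, net, cpu) tuples with non-negative usage
    hosts: list of (name, mem_capacity, net_capacity) tuples
    Returns (peak_cpu, {container: host}), or (inf, None) if nothing fits.
    """
    # Placing CPU-heavy containers first tends to tighten the bound sooner.
    order = sorted(containers, key=lambda c: c[3], reverse=True)
    mem = [0.0] * len(hosts)
    net = [0.0] * len(hosts)
    cpu = [0.0] * len(hosts)
    best = {"peak": float("inf"), "assignment": None}
    current = {}

    def place(i):
        if i == len(order):
            peak = max(cpu, default=0.0)
            if peak < best["peak"]:
                best["peak"] = peak
                best["assignment"] = dict(current)
            return
        name, c_mem, c_net, c_cpu = order[i]
        for h, (h_name, h_mem, h_net) in enumerate(hosts):
            # Constraint: memory and network I/O must fit on the host.
            if mem[h] + c_mem > h_mem or net[h] + c_net > h_net:
                continue
            # Bound: this branch cannot beat the best solution found so far.
            if cpu[h] + c_cpu >= best["peak"]:
                continue
            mem[h] += c_mem
            net[h] += c_net
            cpu[h] += c_cpu
            current[name] = h_name
            place(i + 1)
            # Undo the placement before trying the next host.
            mem[h] -= c_mem
            net[h] -= c_net
            cpu[h] -= c_cpu
            del current[name]

    place(0)
    return best["peak"], best["assignment"]


# Hypothetical data: (name, memory, network I/O, CPU) and (name, mem cap, net cap).
containers = [("web-1", 2, 1, 40), ("web-2", 2, 1, 40),
              ("test-1", 1, 1, 25), ("stage-1", 1, 1, 5)]
hosts = [("host-a", 4, 4), ("host-b", 4, 4)]

peak, assignment = schedule(containers, hosts)
print(peak, assignment)
```

A real deployment would hand the model to a solver, as the script does, rather than enumerate assignments by hand; the sketch only shows the shape of the search.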

Speed

This problem is NP-hard, so calculation times with the branch and bound algorithm became infeasible for real-world use when I tested more than 200 pools of containers. Solving a small data set of 20 pools across 4 hosts took seconds, but 200 pools across 10 hosts had not converged after 8 hours on my MacBook. Fortunately, the algorithm is parallelizable, so it is possible to use all of the cores on a multi-core machine.
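
A rough count of the search space shows why the two test cases behave so differently. The sketch below assumes one container per pool and ignores the capacity constraints, so it gives only a lower bound on the number of raw assignments; branch and bound prunes much of this space, but the worst case still grows exponentially. The helper name is hypothetical.

```python
def assignment_count(containers, hosts):
    """Number of ways to give each container one of the hosts."""
    return hosts ** containers


small = assignment_count(20, 4)    # 20 pools across 4 hosts
large = assignment_count(200, 10)  # 200 pools across 10 hosts

print(small)            # 1099511627776
print(len(str(large)))  # 201 (digits in the larger count)
```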
The calculation takes so long because it seeks a convergent, optimal solution. In practice, however, solutions that deviate slightly from the optimum are still viable, because of uncertainty in the data used for the calculation and the excess capacity on each host. A way to deal with this is summarized in the “Formulation” bullet point under Possible Improvements below.

Possible Improvements

  • Data – When pulling historical data, use (average + standard deviation) or (average + 2*standard deviation) instead of a plain average in order to account for fluctuations in container resource consumption.
  • Formulation – Instead of treating the problem as an optimization with an objective function, treat it as a constraint programming problem. To do this in the script, set a static CPU limit, modify the CPU constraint to be less than this limit, and delete the objective function.
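
The Formulation idea can be sketched as a plain backtracking search that stops at the first assignment meeting every limit. This is a Python illustration under assumptions, not a change to the actual script: the function name, data layout, and numbers are hypothetical, limits are treated as inclusive, and the redundancy goal is left out.

```python
def find_feasible(containers, hosts, cpu_limit):
    """Return the first assignment that respects every limit, or None.

    containers: list of (name, mem, net, cpu) tuples
    hosts: list of (name, mem_capacity, net_capacity) tuples
    cpu_limit: static cap on the total CPU placed on any one host
    """
    used = {h_name: [0.0, 0.0, 0.0] for h_name, _, _ in hosts}
    assignment = {}

    def place(i):
        if i == len(containers):
            return True  # every container is placed, so stop searching
        name, c_mem, c_net, c_cpu = containers[i]
        for h_name, h_mem, h_net in hosts:
            u = used[h_name]
            if (u[0] + c_mem > h_mem or u[1] + c_net > h_net
                    or u[2] + c_cpu > cpu_limit):
                continue
            u[0] += c_mem
            u[1] += c_net
            u[2] += c_cpu
            assignment[name] = h_name
            if place(i + 1):
                return True
            # Dead end further down: undo and try the next host.
            u[0] -= c_mem
            u[1] -= c_net
            u[2] -= c_cpu
            del assignment[name]
        return False

    return dict(assignment) if place(0) else None


# Hypothetical data: (name, memory, network I/O, CPU) and (name, mem cap, net cap).
containers = [("web-1", 2, 1, 40), ("web-2", 2, 1, 40),
              ("test-1", 1, 1, 25), ("stage-1", 1, 1, 5)]
hosts = [("host-a", 4, 4), ("host-b", 4, 4)]

plan = find_feasible(containers, hosts, cpu_limit=70)
print(plan)
```

Because there is no objective, the search returns as soon as any valid plan exists, which is the trade described in the bullet: a good-enough placement instead of a provably optimal one. If the limit is set too low, the function returns None and the limit has to be raised.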

Applications

The system is designed to run as a cron job, perhaps every hour, in order to continuously reallocate containers between hosts as their resource usage changes and new pools get added.
The most practical application of this system that I found was capacity planning. With slight modification, it is possible to use this script to determine the minimum number of hosts required to service all containers. Reducing the number of hosts, even by one, has the potential for significant cost savings.
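
For a quick capacity estimate, a first-fit decreasing heuristic is a common shortcut for bin packing. The Python sketch below swaps that heuristic in for the modified optimization script mentioned above, so its answer is an upper bound on the minimum host count, not a proven minimum. It assumes identical hosts and a per-host CPU budget; the function name and numbers are hypothetical.

```python
def hosts_needed(containers, host_mem, host_net, host_cpu):
    """Estimate how many identical hosts are needed (first-fit decreasing).

    containers: list of (name, mem, net, cpu) tuples
    host_mem, host_net, host_cpu: per-host limits, treated as inclusive
    """
    hosts = []  # one [mem_used, net_used, cpu_used] entry per opened host
    # Consider the most CPU-hungry containers first.
    for name, c_mem, c_net, c_cpu in sorted(
            containers, key=lambda c: c[3], reverse=True):
        if c_mem > host_mem or c_net > host_net or c_cpu > host_cpu:
            raise ValueError(f"{name} does not fit on an empty host")
        for used in hosts:
            if (used[0] + c_mem <= host_mem and used[1] + c_net <= host_net
                    and used[2] + c_cpu <= host_cpu):
                used[0] += c_mem
                used[1] += c_net
                used[2] += c_cpu
                break
        else:
            # No open host has room, so open a new one.
            hosts.append([c_mem, c_net, c_cpu])
    return len(hosts)


# Hypothetical data: (name, memory, network I/O, CPU).
containers = [("web-1", 2, 1, 40), ("web-2", 2, 1, 40),
              ("test-1", 1, 1, 25), ("stage-1", 1, 1, 5)]

roomy = hosts_needed(containers, host_mem=4, host_net=4, host_cpu=70)
tight = hosts_needed(containers, host_mem=4, host_net=4, host_cpu=50)
print(roomy, tight)
```

Comparing the counts for different CPU budgets gives a rough feel for how much headroom costs in hosts.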
Disaster recovery is another application. If a host goes down and a new one needs to be provisioned to take its place, this system can reallocate all containers across the remaining hosts before the new host is available, then reallocate again once it comes online.

Next Steps

The current scheduler is a proof of concept. Its most important use may be capacity planning for selecting the quantity and resources of hosts used for Quadra. To learn more about optimization in Julia, check out my previous blog post on the topic.
