Learn Ramly

Everything you need to go from a blank canvas to a finished RAM study: the concepts, the editor, the simulation engine and how to read the results.

Overview

Ramly is a browser-based tool for Reliability, Availability and Maintainability (RAM) analysis. You describe your system as a reliability block diagram (RBD), attach failure and repair behaviour to each piece of equipment, and Ramly runs a Monte Carlo simulation — thousands of simulated lifetimes — to predict availability, failure counts, production throughput and downtime cost.

1. Build: draw the RBD → 2. Define: failure & repair data → 3. Simulate: Monte Carlo runs → 4. Analyze: KPIs & charts → 5. Report: PDF / Word export
The Ramly workflow — each step maps to a tab or button in the app.

No installation is needed; everything runs in the browser. Models are saved to your account, simulations execute on Ramly servers, and results stay attached to the model for comparison and reporting.

What is a RAM study?

A RAM study quantifies three related properties of a system:

Term | Meaning | Typical measure
Reliability | The ability of an item to perform its required function, under given conditions, for a given time interval. | Mean time to failure (MTTF), failure rate
Availability | The ability of an item to be in a state to perform when required; in practice, the fraction of time the system is operational. | Uptime ÷ total time (often expressed in %)
Maintainability | The ability of an item to be retained in, or restored to, an operating state when maintenance is performed. | Mean time to repair (MTTR)

[Figure: timeline of alternating UP and DOWN states, marking time to failure (TTF) and time to repair (TTR). Availability = up time ÷ total time.]
A system alternates between up and down states. Availability is the long-run fraction of time spent up.

For repairable systems in steady state, availability relates to the two mean times: A = MTTF / (MTTF + MTTR). The mean time between failures is the full cycle: MTBF = MTTF + MTTR.
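
As a quick worked example of the two relations above, the following minimal sketch evaluates them for one item. The function names and the input values (8,760 h MTTF, 24 h MTTR) are illustrative assumptions, not figures from Ramly.

```python
def steady_state_availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state availability A = MTTF / (MTTF + MTTR), times in hours."""
    if mttf_h <= 0 or mttr_h < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_h / (mttf_h + mttr_h)


def mtbf(mttf_h: float, mttr_h: float) -> float:
    """Mean time between failures as the full cycle: MTBF = MTTF + MTTR."""
    return mttf_h + mttr_h


# Illustrative inputs: fails about once per running year, one-day repair.
example_availability = steady_state_availability(8760.0, 24.0)
example_mtbf = mtbf(8760.0, 24.0)
print(f"A = {example_availability:.4%}, MTBF = {example_mtbf:.0f} h")
```

With these inputs the availability is 8,760 / 8,784, or roughly 99.73%.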

Why simulation instead of a formula?

Closed-form formulas only work for simple structures with constant failure rates. Real plants have redundancy, standby equipment, shared repair crews, limited spares, buffer tanks and ageing equipment — interactions no formula captures. Monte Carlo simulation handles all of this by simply replaying the system’s life many times with randomly drawn failure and repair times.

[Figure: run 1, run 2, run 3, … repeated thousands of times → distribution of availability, failures and cost]
Each simulation run draws different random failure/repair times. Aggregating thousands of runs gives the distribution of outcomes, not just a single average.
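
The idea can be sketched for the simplest possible case: one repairable item that alternates between running and being repaired. This is not Ramly's engine; it is a minimal illustration that assumes exponentially distributed failure and repair times, and all names and values are hypothetical.

```python
import random


def simulate_one_lifetime(mttf_h, mttr_h, mission_h, rng):
    """Replay one lifetime of a single item; return its availability."""
    clock = 0.0
    up_time = 0.0
    while clock < mission_h:
        # Draw a random running time and count only the part inside the mission.
        ttf = rng.expovariate(1.0 / mttf_h)
        up_time += min(ttf, mission_h - clock)
        clock += ttf
        if clock >= mission_h:
            break
        # Draw a random repair time; the item is down for this interval.
        clock += rng.expovariate(1.0 / mttr_h)
    return up_time / mission_h


def run_study(n_runs, mttf_h=1000.0, mttr_h=50.0, mission_h=87600.0, seed=42):
    """Repeat the lifetime many times and collect one availability per run."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [simulate_one_lifetime(mttf_h, mttr_h, mission_h, rng)
            for _ in range(n_runs)]


availabilities = run_study(2000)
mean_availability = sum(availabilities) / len(availabilities)
print(f"mean availability over {len(availabilities)} runs: {mean_availability:.4f}")
```

For this single item the mean should land close to the closed-form value 1000 / 1050; the value of simulation shows up once redundancy, crews, spares and buffers interact and no such formula exists.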

Why run a RAM study?

Common reasons reliability engineers run RAM studies:

  • Compare design options — does a third pump (2-of-3) buy enough availability to justify its cost? Is cold standby sufficient, or is hot redundancy required?
  • Verify availability targets — demonstrate that a design meets a contractual or internal availability requirement before it is built.
  • Size spares and buffers — how many spare pumps should be in the warehouse? How much storage capacity bridges a typical outage?
  • Evaluate maintenance strategy — quantify the effect of preventive-maintenance intervals, crew sizes and logistics delays on production.
  • Quantify production loss — translate downtime into lost throughput and money, so reliability investments can be ranked against each other.
  • Find the critical equipment — importance rankings show which components dominate system unavailability, focusing engineering effort where it matters.

Standards & references

RAM analysis is supported by a body of international standards. The most relevant to the methods used in Ramly:

Reference | Title / scope
IEC 61078 | Reliability block diagrams. Defines the RBD technique: series, parallel and k-of-n structures and how to evaluate them.
ISO 20815 | Petroleum, petrochemical and natural gas industries: Production assurance and reliability management. The framework standard for production-availability analysis in oil & gas projects.
ISO 14224 | Petroleum and natural gas industries: Collection and exchange of reliability and maintenance data for equipment. Defines equipment taxonomies and the failure/repair data on which studies are built.
IEC 60300 series | Dependability management. Management and application guides for dependability programmes, including analysis techniques.
IEC 60050-192 | International Electrotechnical Vocabulary: Dependability. The standard definitions of reliability, availability, maintainability and related terms.
OREDA | Offshore and Onshore Reliability Data handbook. A widely used published source of equipment failure and repair data (a data reference, not a standard).

This list is informative, not exhaustive. Always consult the current edition of a standard and your company’s own engineering practices for contractual work.

Getting started

  1. Sign in

    Ramly is currently in closed beta; registration needs an invite code. Every new account starts with a 14-day Professional trial.

  2. Explore the example models

    Your account comes pre-loaded with five real-world example models (oil & gas pumping, gas compression, air separation, offshore wind, green hydrogen). They are for learning: open any of them to see a complete, runnable study.

  3. Open a model

    Click a model card on the dashboard. The editor opens on the RBD tab showing the full diagram.

  4. Run a simulation

    Press Run Simulation (top right). The job appears on the dashboard with a live progress bar and typically finishes in seconds.

  5. Read the results

    When the job completes, open it to see availability, MTBF/MTTR, throughput and cost charts, explained in the Reading the results section below.

The first time you open the dashboard, a 30-second guided tour points out the key areas. You can restart it any time from the Account page.

Building an RBD

The model editor has three tabs — RBD (the diagram), Simulation (parameters and advanced features) and Results. On the RBD tab you build the diagram from four block types, dragged in from the palette on the left:

Block | What it represents | System is UP when…
Component | A single piece of equipment with its own failure and repair behaviour. | the component is working
Series block | A chain where everything is required. | all children are up
Parallel block | Redundancy: k of n children must work. k is set in the Block Inspector. | at least k children are up
Standby block | Redundancy where reserve units wait (cold/warm) and start when a duty unit fails. | enough units are running

Series structure

Blocks in series all must operate — one failure brings the path down. Reading is left to right, from the green S terminal to the red E terminal:

[Diagram: S → Pump → Heat Exch. → Compressor → E]
A series path: pump, heat exchanger and compressor are all required.
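
For intuition, the availability of a series path can be estimated by hand when the blocks fail and are repaired independently: it is the product of the individual availabilities. The sketch below assumes that independence, and the three availability figures are made-up illustrative values.

```python
import math


def series_availability(availabilities):
    """Availability of a series path, assuming independent blocks.

    Every block is required, so the path is up only when all blocks are up.
    """
    return math.prod(availabilities)


# Illustrative values for pump, heat exchanger and compressor.
path_availability = series_availability([0.99, 0.995, 0.97])
print(f"series path availability: {path_availability:.4f}")
```

The result is lower than the weakest block on its own, which is why long series chains are usually where availability is lost.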

Parallel (k-of-n) structure

Parallel branches fan out at a fork dot and rejoin at a join dot. The k value defines how many branches must work — a 2-of-3 pump set tolerates one pump being down with no production loss:

[Diagram: S → fork → Pump A, Pump B and Pump C in parallel (2-of-3) → join → E]
2-of-3 parallel redundancy: any two of the three pumps keep the system up.
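
For identical, independent branches the k-of-n availability follows the binomial distribution: sum the probabilities that exactly k, k+1, …, n branches are up. The sketch below assumes identical and independent branches; the 95% per-pump availability is an illustrative value.

```python
from math import comb


def k_of_n_availability(k: int, n: int, a: float) -> float:
    """Probability that at least k of n identical, independent branches are up.

    a is the availability of a single branch.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k, n + 1))


# A 2-of-3 pump set where each pump is available 95% of the time.
two_of_three = k_of_n_availability(2, 3, 0.95)
print(f"2-of-3 availability: {two_of_three:.5f}")
```

Setting k = n reproduces a series path and k = 1 a fully redundant group, so the same function covers both extremes.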

Standby

In a standby group the reserve unit doesn’t run until needed — shown with a dashed border, the standard convention for standby in RBDs. A standby unit can have its own (usually lower) failure behaviour while dormant, configured on its component type:

[Diagram: S → Duty Pump (operating) with Standby Pump (waiting, dashed border) → E]
Duty / standby: the dashed unit waits and takes over on failure of the duty unit.
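
To see why standby differs from running both units at once, the sketch below compares a cold-standby pair with a hot (both running) pair. It is deliberately simplified and makes several assumptions that are not Ramly's defaults: exponential lifetimes, no repair, perfect switchover, and a standby unit that cannot fail while dormant. All names are hypothetical.

```python
import random


def cold_standby_life(mttf_h, rng):
    """Duty unit runs to failure, then the dormant standby takes over."""
    return rng.expovariate(1.0 / mttf_h) + rng.expovariate(1.0 / mttf_h)


def hot_redundant_life(mttf_h, rng):
    """Both units run from the start; the pair lasts until the later failure."""
    return max(rng.expovariate(1.0 / mttf_h), rng.expovariate(1.0 / mttf_h))


def mean_life(life_fn, mttf_h, n_runs, seed):
    """Average time to loss of the pair over n_runs simulated lifetimes."""
    rng = random.Random(seed)
    return sum(life_fn(mttf_h, rng) for _ in range(n_runs)) / n_runs


cold_mean = mean_life(cold_standby_life, 1000.0, 20000, seed=1)
hot_mean = mean_life(hot_redundant_life, 1000.0, 20000, seed=1)
print(f"cold standby: {cold_mean:.0f} h, hot redundancy: {hot_mean:.0f} h")
```

Under these assumptions theory gives 2 × MTTF for the cold-standby pair and 1.5 × MTTF for the hot pair, because the dormant unit does not age while it waits.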

Hierarchy — systems inside systems

Containers nest. A typical plant model is a parallel set of trains, where each train is itself a series of equipment. The canvas shows the whole hierarchy at once, with each container drawn as a labelled box around its children:

[Diagram: ASU SYSTEM (2-of-3 · Parallel) containing Train A, Train B and Train C, each a Series of COMPRESSOR → COLD BOX]
Nested containers: a 2-of-3 parallel system of three identical series trains.
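
Nesting maps naturally onto a tree that is evaluated recursively. The sketch below is a hypothetical data structure, not Ramly's model format: it rebuilds the 2-of-3 system of series trains and decides whether the system is up for a given set of component states.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Component:
    name: str


@dataclass
class Series:
    children: List[Block]


@dataclass
class Parallel:
    k: int  # minimum number of children that must be up
    children: List[Block]


Block = Union[Component, Series, Parallel]


def is_up(block: Block, state: Dict[str, bool]) -> bool:
    """Evaluate a block tree; state maps component name to True when working."""
    if isinstance(block, Component):
        return state[block.name]
    child_states = [is_up(child, state) for child in block.children]
    if isinstance(block, Series):
        return all(child_states)
    return sum(child_states) >= block.k  # Parallel: at least k children up


def train(tag: str) -> Series:
    return Series([Component(f"COMPRESSOR {tag}"), Component(f"COLD BOX {tag}")])


asu = Parallel(k=2, children=[train("A"), train("B"), train("C")])
all_working = {f"{item} {tag}": True
               for tag in "ABC" for item in ("COMPRESSOR", "COLD BOX")}

one_train_lost = dict(all_working, **{"COMPRESSOR A": False})
two_trains_lost = dict(one_train_lost, **{"COLD BOX B": False})
print(is_up(asu, one_train_lost), is_up(asu, two_trains_lost))
```

Losing one train leaves two working trains, so the system stays up; losing a component in a second train drops it below the 2-of-3 requirement.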

Working on the canvas

  • Drag from the palette to add a block to the container you are viewing.
  • Click a block to select it and edit its properties in the Block Inspector (right panel).
  • Double-click a container to zoom into that subtree; use the breadcrumb to navigate back up.
  • Collapse / expand — the chevron in a container’s header collapses it to a single box; the round chevron button on a collapsed box expands it again. Containers with more than 6 children start collapsed so large models stay readable. Expand all / Collapse all are in the toolbar.
  • Delete a selected block with the Delete or Backspace key.
  • PNG export — the toolbar’s PNG button saves the current view as an image for slides and reports.
  • Layout is automatic — blocks are always arranged in clean engineering style; there is nothing to line up by hand.

Component types & failure data

Equipment behaviour lives in component types — named templates such as CENTRIFUGAL_PUMP defined once and assigned to any number of component blocks. Manage them in the Types tab of the editor’s left panel. A type defines, at minimum:

  • Failure distribution — how long the item runs before failing.
  • Repair distribution — how long a repair takes once started.

and optionally:

  • Logistics delay — waiting time before repair work can start (mobilisation, parts on order).
  • Standby failure distribution — a separate (usually slower) failure behaviour while a unit is dormant in a standby group.
  • Preventive maintenance — a PM interval plus a PM duration distribution.
  • Costs — repair cost and PM cost per event, used by the cost model.
  • Spare pool — link the type to a spare-parts pool and define the replacement time.
  • Failure modes — split one item into independent sub-modes (e.g. bearing vs. seal failure), each with its own failure and repair behaviour.

Distributions

All times are in hours. Seven distributions are available:

Exponential

Constant failure rate (memoryless). Parameter is the MTTF in hours.

Weibull

Shape < 1 models infant mortality, shape = 1 reduces to the exponential, shape > 1 models wear-out. Parameters: scale and shape.

Normal

Symmetric around the mean. Mean + standard deviation. Common for repairs.

Lognormal

Right-skewed; long repair tails. Log-mean + log-std-dev.

Uniform

Any value between min and max equally likely.

Triangular

Min, most-likely (mode), max. Good for expert estimates.

Constant

Fixed duration every time — e.g. a planned 8-hour PM task.

For the exponential distribution the parameter you enter is the MTTF (mean time to failure) in hours — e.g. entering 8,760 means the item fails on average once a year of running time.
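
To get a feel for the seven distributions, each can be sampled with Python's standard `random` module. This is a sketch for experimentation only: the `sample_hours` function and its parameter names are hypothetical, and clamping negative normal draws to zero is an assumption made here, not documented Ramly behaviour.

```python
import random


def sample_hours(dist: str, rng: random.Random, **p: float) -> float:
    """Draw one duration in hours from the named distribution."""
    if dist == "exponential":
        return rng.expovariate(1.0 / p["mttf"])  # parameter is the MTTF
    if dist == "weibull":
        return rng.weibullvariate(p["scale"], p["shape"])
    if dist == "normal":
        # Assumption: a duration cannot be negative, so clamp at zero.
        return max(0.0, rng.gauss(p["mean"], p["std"]))
    if dist == "lognormal":
        return rng.lognormvariate(p["log_mean"], p["log_std"])
    if dist == "uniform":
        return rng.uniform(p["min"], p["max"])
    if dist == "triangular":
        return rng.triangular(p["min"], p["max"], p["mode"])
    if dist == "constant":
        return p["value"]
    raise ValueError(f"unknown distribution: {dist}")


rng = random.Random(7)
draws = [sample_hours("exponential", rng, mttf=8760.0) for _ in range(20000)]
mean_ttf = sum(draws) / len(draws)
print(f"mean of {len(draws)} exponential draws: {mean_ttf:.0f} h")
```

The sample mean of the exponential draws should sit close to the 8,760 h entered as the MTTF, which is a convenient sanity check when converting failure rates from a data handbook.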

You can also import component types in bulk from CSV or Excel, and save reusable libraries — see Import & export.

Advanced features

All of these are configured on the Simulation tab of the editor.

Feature | What it models
Spare pools | A warehouse with a starting quantity of spares and a replenishment lead time. When a component fails and no spare is on the shelf, the repair waits; stock-outs are reported in the results.
Crew groups | A limited number of maintenance crews shared by a set of equipment. If every crew is busy, additional repairs queue until one frees up.
Common-cause failures | A CCF group fails several components simultaneously from one shared cause (e.g. loss of cooling water takes out both pumps), with its own failure and repair distributions.
Storage buffers | A tank or battery with a capacity and fill rate assigned to a block. When upstream equipment trips, the buffer keeps production going until it empties; downtime only counts after that.
Throughput capacity | Real production units (e.g. t/h) per block and for the system. With capacities set, results report actual throughput and production lost in real units, not just percent.
Cost model | Downtime cost per hour plus per-event repair and PM costs. Results then include a cost breakdown (downtime vs. corrective vs. preventive).
Scenarios | Saved variants of one model, e.g. "2 pumps" vs. "3 pumps", so you can run and compare design options side by side without duplicating the model.
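
The storage-buffer rule, where downtime only counts once the buffer is empty, can be illustrated with a few lines of arithmetic. This is a simplified sketch with hypothetical names: it assumes the buffer is full when the outage starts and a constant downstream demand, and it ignores refilling at the fill rate afterwards.

```python
def counted_downtime_h(outage_h: float, buffer_level: float, demand_per_h: float) -> float:
    """Hours of an upstream outage that count as downtime after buffering.

    buffer_level and demand_per_h use the same production unit (e.g. t and t/h).
    """
    if demand_per_h <= 0:
        raise ValueError("demand_per_h must be positive")
    bridged_h = min(outage_h, buffer_level / demand_per_h)
    return outage_h - bridged_h


# A 400 t tank feeding 50 t/h bridges the first 8 hours of any outage.
long_outage = counted_downtime_h(12.0, 400.0, 50.0)
short_outage = counted_downtime_h(5.0, 400.0, 50.0)
print(long_outage, short_outage)
```

A 12-hour outage therefore costs 4 hours of production, while a 5-hour outage is absorbed completely.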

Running a simulation

Key parameters on the Simulation tab:

Parameter | Meaning
Number of simulations | How many independent lifetimes to simulate. More runs → smoother statistics. Up to 100,000 per run (plan-dependent).
Simulation duration | The mission time of each lifetime, in hours (e.g. 87,600 h = 10 years, the maximum).
Time window | Splits the duration into reporting buckets (e.g. yearly windows) so you can see how availability evolves over the asset life.
Random seed | Optional. Fixing the seed makes results exactly reproducible run-to-run.
Tracked blocks | Which blocks get detailed per-block KPIs in the results.
Importance / sensitivity | Optional analyses: component importance ranking, and the effect of ±20% changes in MTTF and MTTR per type.
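
The ±20% sensitivity idea can be previewed by hand for a single item using the steady-state formula A = MTTF / (MTTF + MTTR). Ramly obtains its sensitivity results from simulation of the whole model; the sketch below is only an analytic illustration with hypothetical names and made-up inputs.

```python
def availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state availability of one repairable item."""
    return mttf_h / (mttf_h + mttr_h)


def sensitivity(mttf_h: float, mttr_h: float, change: float = 0.20) -> dict:
    """Change in availability when MTTF or MTTR moves by +/- change."""
    base = availability(mttf_h, mttr_h)
    cases = {
        f"MTTF -{change:.0%}": availability(mttf_h * (1 - change), mttr_h),
        f"MTTF +{change:.0%}": availability(mttf_h * (1 + change), mttr_h),
        f"MTTR -{change:.0%}": availability(mttf_h, mttr_h * (1 - change)),
        f"MTTR +{change:.0%}": availability(mttf_h, mttr_h * (1 + change)),
    }
    return {label: value - base for label, value in cases.items()}


deltas = sensitivity(1000.0, 50.0)
for label, delta in deltas.items():
    print(f"{label}: {delta:+.4f}")
```

Large deltas point at the parameters where better data or better equipment changes the answer most.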

Press Run Simulation. Jobs run on the server — you can leave the page; progress is visible on the dashboard, and finished results stay attached to the job.

Start with ~1,000 simulations while iterating on the model, then increase for final numbers. The convergence card in the results suggests how many runs are needed for a stable answer.
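
Why more runs give a more stable answer can be shown with the standard normal-approximation confidence interval for a mean, whose half-width shrinks with the square root of the number of runs. How Ramly's convergence card computes its suggestion is not documented here, so treat the sketch below as a generic illustration with hypothetical function names.

```python
import math
import statistics


def ci_half_width(samples, z: float = 1.96) -> float:
    """Half-width of the ~95% confidence interval of the sample mean."""
    return z * statistics.stdev(samples) / math.sqrt(len(samples))


def runs_needed(std_dev: float, target_half_width: float, z: float = 1.96) -> int:
    """Runs required so the confidence interval half-width meets the target."""
    return math.ceil((z * std_dev / target_half_width) ** 2)


# If per-run availability scatters with a standard deviation of 0.01,
# pinning the mean down to +/- 0.001 needs a few hundred runs.
suggested_runs = runs_needed(std_dev=0.01, target_half_width=0.001)
print(suggested_runs)
```

Halving the target width quadruples the required number of runs, which is why a quick 1,000-run pass is usually enough while iterating.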

Reading the results

The results page is organised top-down, headline first:

  • System availability — the average across all runs, with P10, P50 and P90 percentiles. Because high availability is good, P10 is the pessimistic case (9 out of 10 simulated lifetimes did better) and P90 the optimistic one. For "bad" quantities like failure counts it flips: P90 is pessimistic.
  • Convergence — the confidence-interval width of the availability estimate and a suggested number of runs for a stable result.
  • System KPIs — failures, MTBF, MTTR, throughput and lost hours over the mission.
  • Availability trend — per time window, when the duration is split into multiple windows.
  • Block KPI table — availability, failures, MTBF, MTTR per tracked block; with throughput capacities configured it adds actual throughput and production lost in real units.
  • Cost breakdown — downtime vs. repair vs. PM cost, when a cost model is defined.
  • Spare pool statistics — usage and stock-outs per pool.
  • Importance ranking — which components contribute most to system unavailability (top 15 shown).
  • Sensitivity — how system availability responds to ±20% changes in each type’s MTTF and MTTR, highlighting where better data or better equipment pays off.
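
The percentile figures quoted in the results can be reproduced from any list of per-run outcomes. The sketch below uses Python's `statistics.quantiles`; the helper name and the sample data are illustrative, and Ramly's exact interpolation method may differ.

```python
import statistics


def p10_p50_p90(samples):
    """Return the 10th, 50th and 90th percentiles of the samples."""
    deciles = statistics.quantiles(samples, n=10)  # nine cut points: P10..P90
    return deciles[0], deciles[4], deciles[8]


# Made-up per-run availabilities from a small study.
run_availabilities = [0.941, 0.952, 0.948, 0.961, 0.957, 0.939, 0.966, 0.950,
                      0.955, 0.944, 0.959, 0.947]
p10, p50, p90 = p10_p50_p90(run_availabilities)
print(f"P10 = {p10:.3f} (pessimistic), P50 = {p50:.3f}, P90 = {p90:.3f} (optimistic)")
```

For availability the P10 value is the cautious planning figure; for failure counts or cost the same function applies, but the P90 value becomes the cautious one.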

Comparing runs

On the dashboard, select several completed jobs and open the comparison view to see their KPIs side by side — the quickest way to evaluate design alternatives that you modelled as separate scenarios.

Reports, import & export

Reports

Generate a formatted study report as PDF or Word (DOCX) from a completed simulation. The report includes the KPI tables, distribution histograms with P10/P50/P90 reference lines, and your own key findings and recommendations — fill those in under the Report section of the Simulation tab before exporting.

Import & export

Data | Formats
Full model | JSON export and import (round-trip safe); use it for backup or sharing
Component types | CSV and Excel (.xlsx) import/export, with conflict resolution on import
Block templates | JSON; save a sub-diagram once, reuse it across models
Component libraries | JSON; shareable sets of component types
RBD diagram | PNG snapshot from the canvas toolbar

Plans & limits

Quotas are counted in runs — one run is one simulation job submission, regardless of size. Every new account starts with a 14-day Professional trial.

Plan | Runs / month | Models | Component types
Starter | 5 | 1 | 5
Professional | 250 | Unlimited | Unlimited
Team | 1,000 (pooled, up to 5 seats) | Unlimited | Unlimited
Enterprise | Unlimited | Unlimited | Unlimited

During the closed beta there are no payments — accounts run on the trial and plans shown on the pricing page are indicative.

Glossary

Term | Definition
RBD | Reliability block diagram: a success-path representation of a system. If a path of working blocks connects start to end, the system is up.
MTTF | Mean time to failure: average operating time until an item fails.
MTTR | Mean time to repair: average time to restore a failed item.
MTBF | Mean time between failures: average time from one failure to the next for a repairable item (MTTF + MTTR).
Availability | Fraction of time a system is able to perform its function.
k-of-n | A parallel group that needs at least k of its n members working.
Standby (cold/warm) | Redundant units that wait instead of running. Cold standby does not fail while waiting; warm standby can fail at a reduced rate.
CCF | Common-cause failure: one shared cause failing multiple components at once.
PM | Preventive maintenance: planned maintenance performed at intervals, as opposed to corrective (after-failure) maintenance.
Spare pool | Warehouse stock of replacement parts shared by components, with a replenishment lead time.
Monte Carlo simulation | Estimating outcomes by repeating an experiment many times with randomly sampled inputs.
P10 / P50 / P90 | Percentiles of the simulated outcome distribution: 10%, 50% (median) and 90% of runs fall at or below these values.

Questions or feedback on this guide? Email hello@ramly.io.