
The 'Big Picture'

for Engineering Data

Engineering has lagged other business domains like finance or sales in benefiting from big data analytics. 

In engineering, a part's geometry isn't just highly correlated with part function, but also with business and performance metrics like cost, manufacturing process and part performance.  

CADseek Analytics fully automates the process of geometric comparison across entire datasets.  Any available meta-data can be used for report filtering so that aluminum parts don't get compared to those made of titanium. 

Projects such as consolidation, standardization, or cost variance discovery, previously prohibitive because they required tens of thousands of hours, become feasible with a report that takes less than an afternoon to run.

What duplicate parts can be consolidated ...

Where  are the best opportunities to develop a standard design...

How much part duplication exists with this acquired company ...

Should these aluminum parts that are 98% similar have a cost that varies by  37% ... 

Why do these similar parts have such widely varying warranty performance  ...


Use Cases: 

Consolidation & Standardization

An Analytics report groups models by a user-defined level of similarity, providing a road map for eliminating duplication and discovering opportunities for standardization. Analytics can also compare one dataset to another, so that duplication and standardization opportunities can be found across datasets.

Price Variance Discovery

Supplier parts are often purchased from multiple vendors at different prices, and sometimes the difference is too great to be justified.

Performance Variance Discovery

CADseek accepts meta-data from any source, so analysis can be run on performance metrics such as warranty.

Data Cleanup

Analytics can be used to find models that lack standard attributes.




The analysis is fully automated. In a dataset of 100,000 models, Analytics will perform 9.99B comparisons and summarize the similarity, with attribute values, in an interactive report.
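The 9.99B figure is consistent with counting every ordered pair of distinct models, n × (n − 1). That counting convention is an assumption, since the page does not say how comparisons are tallied. A minimal sketch of the arithmetic:

```python
def ordered_pair_count(n_models: int) -> int:
    """Number of (A, B) comparisons where A and B are distinct models."""
    return n_models * (n_models - 1)


def unordered_pair_count(n_models: int) -> int:
    """Number of distinct pairs if A-vs-B and B-vs-A are counted once."""
    return n_models * (n_models - 1) // 2


n = 100_000
print(f"ordered pairs:   {ordered_pair_count(n):,}")    # 9,999,900,000
print(f"unordered pairs: {unordered_pair_count(n):,}")  # 4,999,950,000
```

Either way the count grows with the square of the dataset size, which is why manual comparison stops being practical well before 100,000 models.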

Cross-Dataset Analysis

A report can be run within a single dataset, or across datasets.  Running reports across datasets allows the comparison of the parts of one division to another, or of the parent company to an acquisition.

User-Defined Similarity Cutoff

Each time a dataset is analyzed, a different similarity threshold can be set (e.g., show all models with at least 91% similarity), or a similarity range can be set.
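As a rough illustration of what a cutoff or a range does to a result set, the sketch below filters a list of scored pairs. The `ScoredPair` type, the model names, and the scores are all hypothetical; this is not CADseek's API or data.

```python
from typing import NamedTuple


class ScoredPair(NamedTuple):
    model_a: str
    model_b: str
    similarity: float  # percent, 0-100


def within_range(pairs, minimum, maximum=100.0):
    """Keep pairs whose similarity lies in [minimum, maximum]."""
    return [p for p in pairs if minimum <= p.similarity <= maximum]


# Hypothetical scores, invented for illustration.
pairs = [
    ScoredPair("A1", "B7", 99.4),
    ScoredPair("A1", "C3", 91.0),
    ScoredPair("A2", "D9", 88.2),
]

at_least_91 = within_range(pairs, 91.0)         # simple cutoff
near_matches = within_range(pairs, 85.0, 95.0)  # similarity range
```

A range is useful when exact duplicates have already been handled and the interest is in near matches that might be candidates for a standard design.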


Each time a dataset is analyzed, a different set of filters can be applied to slice the data. Filters can combine any available meta-data fields and attribute values using Boolean operators, so a report can be limited to a certain material type, surface finish, author, commit date, division, etc.
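Combining attribute filters with Boolean operators can be pictured as composing small predicates. The sketch below is an assumption about the general idea, not CADseek's filter syntax; the field names, values, and helper functions are hypothetical.

```python
# Hypothetical meta-data records; field names and values are invented.
models = [
    {"id": "A1", "material": "aluminum", "finish": "anodized", "division": "East"},
    {"id": "A2", "material": "titanium", "finish": "polished", "division": "East"},
    {"id": "A3", "material": "aluminum", "finish": "polished", "division": "West"},
]


def field_is(field, value):
    """Predicate: the record's field equals value."""
    return lambda record: record.get(field) == value


def all_of(*predicates):
    """Boolean AND of predicates."""
    return lambda record: all(p(record) for p in predicates)


def any_of(*predicates):
    """Boolean OR of predicates."""
    return lambda record: any(p(record) for p in predicates)


def negate(predicate):
    """Boolean NOT of a predicate."""
    return lambda record: not predicate(record)


# aluminum AND (division East OR finish polished) AND NOT finish anodized
rule = all_of(
    field_is("material", "aluminum"),
    any_of(field_is("division", "East"), field_is("finish", "polished")),
    negate(field_is("finish", "anodized")),
)
selected = [m["id"] for m in models if rule(m)]
```

Applying the material filter before comparison is what keeps aluminum parts from being paired with titanium ones.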


CADseek Analytics can be performed within or across datasets with a wide mix of different CAD formats, assembly models and neutral formats like .igs, .stl and .stp.  

Interactive & Integrated

Any model can be inspected and viewed in 3D from within the report (see image below), with links to CADseek Connect to allow viewing of model attributes and search. 



Each Analytics report that's generated can be viewed in three different formats. 

The Dataset Overview report shows the level of duplication and similarity within or between datasets. The example report at right shows the dataset has 2,748 pairs of duplicate items, and 229 pairs of items that are from  99% to 99.999% similar.     

Dataset Overview Report 

The Paired-Models Report shows each pair of models, within a dataset or across datasets, whose similarity is at or above the threshold set for that report, e.g., show me all models with similarity of 85% or greater.


The paired-model format means that if model A1 has three similar models at or above the similarity threshold, then model A1 will appear in the left column of the report three times.

Paired-Models Report 

Grouped-Models Report

The Grouped-Models Report shows the same data as the Paired-Models Report, but each model's matches are presented on a single row. For example, if model A1 has similarity to three other models at or above the threshold, then A1 will appear in the left column just one time, and all three similar models will appear on the same row to the right.


Like the Paired-Models Report, the Grouped-Models Report can be run within a dataset or across two datasets.
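The relationship between the two layouts can be sketched as a simple regrouping of rows. The row format and model names below are hypothetical and only illustrate the paired-versus-grouped idea described above.

```python
from collections import defaultdict

# Hypothetical paired-models rows: (left model, similar model, similarity %).
paired_rows = [
    ("A1", "B7", 99.4),
    ("A1", "C3", 96.1),
    ("A1", "D9", 91.0),
    ("A2", "E5", 93.3),
]


def group_rows(rows):
    """Collapse paired rows into one row per left-column model."""
    grouped = defaultdict(list)
    for left, right, similarity in rows:
        grouped[left].append((right, similarity))
    return dict(grouped)


grouped_rows = group_rows(paired_rows)
for left, matches in grouped_rows.items():
    print(left, matches)
```

Here A1 occupies three rows in the paired layout and one row with three matches in the grouped layout.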

Customized Reports

Each time an Analytics report is run, the user can choose any combination of attribute filters to slice the data and make customized reports, e.g., a report for a division, for manufactured parts, for a specific vendor's parts, or for parts with the attribute 'valve'.

Tab-Delimited Format

Each Analytics report can be exported in a tab-delimited format so that the similarity data can be aligned with other data, such as cost or performance, to analyze variances.

The ability to export a report allows further analysis in other software applications such as Microsoft Excel.
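A minimal sketch of aligning an exported report with cost data follows. The column names, part numbers, and costs are invented for illustration, and measuring the difference relative to the lower cost is an assumption; the actual export layout may differ.

```python
import csv
import io

# Hypothetical export: the column names and rows are invented for illustration.
exported = (
    "model_a\tmodel_b\tsimilarity\n"
    "P-100\tP-200\t98.0\n"
    "P-300\tP-400\t99.5\n"
)

# Hypothetical unit costs from a separate system.
unit_cost = {"P-100": 10.00, "P-200": 13.70, "P-300": 4.00, "P-400": 4.10}


def cost_variances(tsv_text, costs):
    """Yield (model_a, model_b, similarity, percent cost difference)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        a, b = row["model_a"], row["model_b"]
        low, high = sorted((costs[a], costs[b]))
        # Difference expressed relative to the cheaper part (an assumption).
        variance_pct = (high - low) / low * 100
        yield a, b, float(row["similarity"]), round(variance_pct, 1)


for record in cost_variances(exported, unit_cost):
    print(record)
```

Sorting the output by cost difference would put pairs that are nearly identical in shape but far apart in price at the top of the list.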

CADseek can import a virtually unlimited number of meta-data fields and attribute values. The Meta-Data Report lists each meta-data field, all attribute values that exist for each meta-data field, and each model that has been tagged with that attribute value.

Because each model is shown as a thumbnail under each attribute value, attribute errors are easy to spot, e.g., that's clearly a Type F and not a Type K.

The Meta-Data Report creates an incredibly useful road map for data cleansing or data migration projects. 
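The structure of such a report (field, then attribute value, then tagged models) can be sketched as a nested index. The records and helper names below are hypothetical; the second helper illustrates the cleanup use case of finding models that lack a standard attribute.

```python
from collections import defaultdict

# Hypothetical tagged models; field names and values are invented.
tagged_models = {
    "M-001": {"type": "F", "material": "aluminum"},
    "M-002": {"type": "K", "material": "aluminum"},
    "M-003": {"type": "F"},  # no material tag
}


def metadata_index(models):
    """Map field -> attribute value -> list of model ids (in sorted order)."""
    index = defaultdict(lambda: defaultdict(list))
    for model_id, attributes in sorted(models.items()):
        for field, value in attributes.items():
            index[field][value].append(model_id)
    return index


def missing_field(models, field):
    """Model ids that lack a given meta-data field."""
    return sorted(m for m, attrs in models.items() if field not in attrs)


index = metadata_index(tagged_models)
untagged = missing_field(tagged_models, "material")
```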


Meta-Data Report 


Return on Investment

Analytics automates work that would take people many thousands of hours to complete. For example, if a business with 100,000 models is acquiring a business with 50,000 models, an Analytics report comparing the two datasets would be the result of 2.5B comparisons and might take a couple of hours to produce. For a human-based approach, suppose 90% of the 2.5B comparisons could be eliminated because of good-quality attributes, and the remaining 250,000,000 comparisons could be completed in just 1 minute each. The project would then require over 2,000 years for a single analyst to complete, at a cost of over $130 million.
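The manual-effort estimate can be reproduced with a few lines of arithmetic. The comparison count, elimination rate, and minutes per comparison come from the text; the working hours per year and the hourly cost are assumptions chosen only to show how figures of this size arise.

```python
TOTAL_COMPARISONS = 2_500_000_000  # figure stated in the text
PERCENT_ELIMINATED = 90            # ruled out by good-quality attributes
MINUTES_PER_COMPARISON = 1         # stated in the text

WORK_HOURS_PER_YEAR = 2_080        # assumption: 40 hours x 52 weeks
HOURLY_COST_USD = 32.0             # assumption: hypothetical loaded rate

remaining = TOTAL_COMPARISONS * (100 - PERCENT_ELIMINATED) // 100
hours = remaining * MINUTES_PER_COMPARISON / 60
years = hours / WORK_HOURS_PER_YEAR
cost = hours * HOURLY_COST_USD

print(f"remaining comparisons: {remaining:,}")
print(f"analyst hours:         {hours:,.0f}")
print(f"analyst years:         {years:,.0f}")
print(f"cost (USD):            {cost:,.0f}")
```

Under these assumptions the single-analyst effort comes to a little over 2,000 working years, in line with the estimate above.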

Projects of such scale would never be funded. But because CADseek Analytics automates the process and creates a project road map in just a few hours, such projects become not only feasible but highly profitable. The ROI potential for two common analytics projects is shown below.

1. Eliminating Duplication.  The elimination of functionally duplicate parts creates economies of scale and eliminates unnecessary purchasing and inventory carrying costs.  The assumptions used in the table at right for redundancy rates and carrying costs are based on research performed by Aberdeen Group and the Parts Standardization and Management Committee of the Department of Defense.

"According to our research, as many as 30% to 40% of manufacturer's parts are duplicates or have acceptable substitutes." 

A study by the Parts Standardization and Management Committee of the Department of Defense calculated a five-year inventory carrying cost of $3,750 for an average part.

2. Vendor Price Discrepancies.  While companies routinely purchase duplicate parts from more than one source, the price charged by each supplier should be very similar.  When prices diverge, Analytics makes it easy to group these identical or highly similar parts and align them with cost information to spot pricing variances like the one shown in the table at right.
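For the first project, the scale of the opportunity can be roughed out from the two figures quoted above. The catalog size is a hypothetical assumption, and treating every duplicate or substitutable part as removable gives an upper bound rather than a forecast.

```python
CARRYING_COST_5YR_USD = 3_750       # per part, from the study cited in the text
DUPLICATE_RATE_PERCENT = (30, 40)   # range quoted in the text

part_count = 100_000                # assumption: hypothetical catalog size


def carrying_cost_of_duplicates(parts, rate_percent):
    """Five-year carrying cost tied up in duplicate parts."""
    duplicates = parts * rate_percent // 100
    return duplicates * CARRYING_COST_5YR_USD


low, high = (
    carrying_cost_of_duplicates(part_count, rate)
    for rate in DUPLICATE_RATE_PERCENT
)
print(f"five-year carrying cost of duplicates: ${low:,} to ${high:,}")
```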

Sample Analytics Similarity Report

Analytics Brochure

Chrome or Edge browsers work best.

White Paper - Applying Analytics to Engineering Data


