Complexity & Customization

Many people underestimate the technical complexity, the cost of managing that complexity, and the staff skills needed to run a semiconductor manufacturing facility. Everything gets hard at nanoscale dimensions. Most articles talk about CMOS manufacturing, where the foundry has created a highly standardized process flow, one that customers have little choice about, to make computer processors or memories. This world is highly risk-averse and change-averse out of necessity.

I like backpacking in the back country for many days on end, so this posting will be sprinkled with fun references to help provide perspective on technology development, which has similarities.

Why will no one work with my project?


What if you are a deep tech optical, or biotech fluidic, or MEMS company, or doing something with quantum and qubits, and there isn't a standard process flow for making your device? You want to introduce novel materials not used in conventional CMOS? Well, that's a really tough sell to many foundries, and here is why!  Big, huge financial RISKS!

Suddenly the client finds out that there is a much shorter list of foundry sites that are willing to work with a company without established sales, and a very non-standard process flow, to build a novel device never made before, using foreign materials.  Think something like quantum computer chips, some biochips with gold structures, and novel devices taking advantage of special material properties. 

From the foundry's perspective, they see nothing but risk: high effort requiring a larger engineering staff, and only small-volume production initially to recover their investment. And what if this custom process causes a catastrophe with one of the standard processes described above, and the foundry ends up with some really angry large-dollar customers, like Apple, Qualcomm, AMD, or Nvidia? How would it look for the manager who approved a small customer with a custom process that contaminated the fab and caused everyone else a serious supply chain issue? That manager would be gone very quickly. You simply won’t be sharing a cleanroom with those other customers.

Maybe you are talking to the wrong people about foundry services?

What a custom process integration means and who to work with? (common for deep-tech projects)

What you are trying to do has not been done before. There literally isn’t much prior art on the best way to build your device successfully the first time around. You therefore need a different kind of foundry facility and relationship to get your startup on the first rung of that manufacturing ladder.  You need a custom R&D foundry that will offer some volume production agreement.  Most of these companies you have never heard of before.  These foundries have larger engineering teams, they are versatile, and accommodating of startup companies, and willing to work through the complexities of development with deep-tech companies. But they may be using older equipment, not as clean, and potentially not as reproducible as you would expect from a place dominated by only a couple kinds of highly standardized and optimized technology.

It takes a special kind of buddy to go backpacking with you for, say, two weeks straight. Similarly, it takes a special foundry for your deep-tech journey. So one aspect to consider is who is going to be your backpacking buddy (or buddies). Choosing a foundry to do a custom integration is business development for a deep-tech startup with nanofabrication needs, and you cannot separate the business decision from the technical decision. You better not choose someone who has only car camped all their life and has always gone home at the first sign of rain, or worse yet someone who has never slept without a bed. They will wimp out at some point during your journey, causing you to abort the trip and restart with another supplier. Ouch! Very expensive mistake.

And what if you don’t get along with the buddy that is essential for the journey? That can be difficult too. Who is going to be there to help keep the peace and keep everyone focused on the journey, not the conflict? Do you need another buddy that you are comfortable with and you know will not abandon you when things get tough?

This is a matchmaking exercise that requires someone to vet the foundry for its equipment capabilities, its deep-tech development skills, and its manufacturing skills: ISO 9001 certification, Change Management, Manufacturing Execution System, Tool Qualification procedures, Yield Enhancement team, Failure Analysis lab, Test Floor, Maintenance Frequency, Equipment supplier contracts, Maturity Level Spec, Materials available, procedures for introducing new materials without impacting other projects, decontamination mitigations, metrology, and, for some projects, Trusted Foundry certification.

These questions are like the shake down of someone’s camping gear to understand the difference between a car camper with that 60F sleeping bag for sleepovers when you were 6 years old, and the backpacker with the 10F water resistant sleeping bag that weighs less than your socks, and doubles as a hammock. Someone who has never been backpacking, might not be able to assess the technical capabilities of the foundry for fit for a particular deep-tech challenge. You have to grill them for way more than just cost, and you need to know what will be important for the success of the journey.

If you don’t know what questions to ask, do you really know how to pick your foundry partner?

The technical risks, the non-recurring engineering (NRE) costs for a custom process flow, and uncertain timelines for such development projects are really daunting for small companies with big dreams, and for small foundries with an expensive facility to keep running. Understanding the scope, cost and timeline ACCURATELY is critical for seed and series A round investor relations and for the foundry themselves.  These foundries make their business model selling engineering hours, not wafers necessarily. The startup company generally wants to do the minimum to keep costs down, and the foundry wants to up-sell to increase revenue, so there is always a bit of tension there around the budget, but scope and cost can often be negotiated to benefit the project and both companies.

Do you know what questions to ask to understand the value offered behind this upselling pressure? Is the extra scope necessary, or is it a cost overrun?

How much funding do you need to get to production? How long will it take?  The real truth to those questions in the case of the start of a deep-tech nano-fabrication project that has never been done before, is that NO ONE CAN KNOW FOR SURE.  This is even true of some 2nd sourcing projects if the 2nd manufacturing site is not identical to the first (a common problem). 

The trick then is to identify the big risks early (fail fast philosophy) and fix those issues as you are evaluating your prototype. I’ll have more posts about how you do that in the future. But first, let’s take a step back and look at the big picture: a mountain forest you want to traverse (the project), and all the tasks that need to be done, including all those complex details (the many, many trees), to reach your goal. You don’t know all the trees at the start of the path. You only know the path and the forest at the beginning of the project, the big picture, with no detail. There is so much complexity that you don’t even know about at the start of the project.

Managing Complexity

What I tell people is that when you have a prototype you have only done 10% of the work towards production, because that has been my experience. Think about it. Do you really know all the ways your device can fail before you have built one? Of course not. Therefore, a prototype is not a product; it is a lemon so sour that only early adopters are hardy enough to accept it. You actually need to mitigate the risks and prevent each and every failure mode that matters before you have a viable product ready for wide adoption. That may even require a redesign of your device in some cases.

Also, if you built your prototype in an academic facility, then how might it break when that custom process is changed in the migration to a commercial R&D to production foundry, with different tooling, and tool conditioning?  Often something changes in a “fab transfer” and you discover new modes of failure that will render your device useless until the nanoscale changes that took place are fixed.

I like to make the analogy that the whole journey feels like hiking in the woods in a fog, taking one step at a time while not quite sure where you are, or what you will encounter beyond your limited field of perception. As a result, it is very easy for founders to promise too much, and deliver too little progress, thus disappointing investors and getting cut off from future funding before finishing the project.  Honesty about the fog of not knowing is required to avoid disappointing the people who are funding the project.  There are ways to talk about certainty and progress while in a fog, that are helpful.

Getting ahead of all those technical risks along the way is done by using a number of proven systematic engineering approaches that I learned on the job, not in school. All of them felt awkward the first time they were mandated. Now I really appreciate them.

Finding Risks & Challenging Assumptions

First, you need to build in risk discovery and mitigation into your project plan, budget, and project management.  There also needs to be a financial buffer to get through unexpected problems (the unknown-unknowns) mid-way through the project that some people refer to as the "valley of death" that destroys many startups.  If you plan to backpack for 10 days, you bring 13 days of food and an extra water filter, because you can’t ask anyone for help when you are the first one charting this path. There is no substitute for deep and insightful planning… it is how we got to the moon after all.

While you cannot predict all the challenges you will encounter at the beginning of a complex technical project, you can anticipate quite a few of them if engineering teams slow down and use structured approaches for identifying risks and prioritizing them. My favorite tool is the Failure Mode & Effect Analysis (FMEA), which I will talk about in another post on risk mitigation. You can also intentionally vary the process too much, just to see if something breaks. This identifies which are the “critical parameters” for ensuring success, and which device performance metrics are affected by each process. This approach might be called a “windowing experiment”. In the semiconductor industry, people might use terms like a “process cliff”, where if you go past a critical extreme the device no longer works. This could be an alignment error, film thickness, feature size or depth, or something more subtle like stress or strain. Computer modeling with software like COMSOL Multiphysics can also help discover failure modes more cost-effectively through simulation, by applying more manufacturing variation than you expect to see, before you spend money on wafer fabrication. For a pretty simple device, the modeling effort may not be worth it.
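To make the windowing idea concrete, here is a minimal sketch in Python. It assumes you have swept one process setting and measured one performance metric at each step. The function name `find_process_window`, the toy film-thickness response, and the spec limit are all hypothetical, invented for illustration and not taken from any real process.

```python
from typing import Callable, Optional, Sequence, Tuple


def find_process_window(
    settings: Sequence[float],
    measure: Callable[[float], float],
    spec_min: float,
) -> Optional[Tuple[float, float]]:
    """Return (low, high) for the widest contiguous run of settings
    whose measured metric meets spec, or None if nothing passes."""
    best = None        # (start_index, end_index) of the widest passing run
    run_start = None   # index where the current passing run began
    last = len(settings) - 1
    for i, setting in enumerate(settings):
        ok = measure(setting) >= spec_min
        if ok and run_start is None:
            run_start = i
        # Close the run when a setting fails or the sweep ends.
        if run_start is not None and (not ok or i == last):
            end = i if ok else i - 1
            if best is None or (end - run_start) > (best[1] - best[0]):
                best = (run_start, end)
            run_start = None
    if best is None:
        return None
    return settings[best[0]], settings[best[1]]


def toy_response(thickness_nm: float) -> float:
    """Hypothetical device metric versus film thickness (assumption):
    gradual roll-off around a 100 nm target, with a hard process
    cliff above 105 nm where the device stops working."""
    if thickness_nm > 105:
        return 0.0
    return 1.0 - abs(thickness_nm - 100) / 50


# Deliberately sweep much wider than the expected variation.
sweep = list(range(70, 131, 5))
window = find_process_window(sweep, toy_response, spec_min=0.75)
print("passing window (nm):", window)
```

In this toy case the window is lopsided: there is plenty of margin on the thin side and almost none on the thick side. That asymmetry is exactly the kind of thing a windowing experiment is meant to expose before it shows up as a yield problem.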

AI won’t help you, because you won’t have any data to train your model on to make a twin of your device with all the input variation from the process, until much later in the project when you are already in production anyway. The semiconductor industry makes heavy use of AI today, but they are running the same process all the time, only the circuits change, not the transistors, so they can use AI because they have past products to train the models with.

An experienced person who has taken technology to production a few times, has seen some of the pitfalls in past projects and can recognize collaborative dysfunction, and technical risks earlier than someone who has never been through the life-cycle of development before. Like that guy that can make out the shape of a bear, behind several trees, in the fog up ahead, that you think is a rock. You need someone with this kind of experience on your team to help you spot hidden dangers early.

My favorite “bear” is some kind of technical assumption. It might be an overly simplistic model for how a device should work, or how a process would affect the device performance, that people latch onto way too dogmatically without having validated it to ensure the mental model is correct. Sometimes assumptions are built on other assumptions into this big house of cards ready to topple if any of the assumptions turn out to be debunked. To someone inexperienced, an assumption is true until proven false. To someone experienced, it is the bear that is about to knock you down and stomp on your former existence. I often point out assumptions explicitly during team discussions, and add project scope to test if those assumptions are true or fiction. Assumptions should never be allowed to persist in the world of fiction for very long. All assumptions can turn out to be a bear, not a rock, and the scope of effort and drama associated with dealing with an angry bear is much, much higher than the response to a rock, even a large rock bigger than a bear.

Mind the bear, or in this case those assumptions everyone accepts without questioning.

Hopefully, you also have a list of highly suspected theoretical risks that would cause your prototype to fail, either during manufacturing or as a field return. Combined with the assumptions, that list defines your risk mitigation strategy and scope of effort. Out of that FMEA comes a set of tasks to perform on your prototype. Then you realize that the main purpose for the existence of your prototype is to be the test vehicle to characterize all the risks and test all the assumptions you can afford to explore, NOT to send to early adopters and investors labeled as a product. Because a prototype is not a product.
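As a rough illustration of how an FMEA turns a list of risks into a prioritized task list, here is a small Python sketch. It assumes the commonly used risk priority number (RPN = severity × occurrence × detection, each scored 1 to 10). Your team or foundry may score differently, and the failure modes and scores below are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (minor) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (almost certain)
    detection: int   # 1 (always caught) to 10 (escapes all testing)

    @property
    def rpn(self) -> int:
        """Risk priority number under the assumed 1-10 scoring."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes: List[FailureMode]) -> List[FailureMode]:
    """Highest-risk items first; these become the first experiments."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical entries for illustration only.
modes = [
    FailureMode("film delamination after anneal", 9, 4, 6),
    FailureMode("lithography misalignment", 6, 5, 2),
    FailureMode("residual stress drift (untested assumption)", 7, 3, 9),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:4d}  {mode.name}")
```

Notice that the untested assumption ranks near the top mostly because of its detection score: nobody would catch it with the current tests. That is the bear hiding behind the trees.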

If you are making a prototype, and you have no plans to run experiments on the very first lot, then you are not doing risk mitigation. Why would you wait until later to start risk management, when it is the longest task in any deep-tech timeline, and often the critical path in your Gantt chart?

Divide and Conquer (using milestones)

Technology maturity milestones are another structured engineering practice. They serve as waypoints towards manufacturability that can be communicated to investors like a map, to show them where you are in a journey that is otherwise obscured. The SEMI.org industry group has created such a standard with input from across the semiconductor industry, and has a training video explaining the phases of the standard that is quite useful. I encourage everyone to watch this video, and ask me about the finer points. When I talk about the maturity of a technology, I am talking about how far along the milestone process the project is and how close it is to production.

Returning to the fog analogy is very insightful here for how milestones help cut through the fog of uncertainty. For a development project, the very next milestone generally becomes more visible near the completion of the current milestone. For example, one generally knows what it will take to introduce a new design, set up new processes and a complete process integration to create a prototype with the specified geometries, and do the pre-work to get all that functioning at target. Never assume the prototype is going to work, however! First-time success is the exception. More often than not, you will complete the prototype testing and have a list of problems that need to be fixed. But now you have a list that you didn’t have at the start of the project! Hence the fog of uncertainty clears as the data accumulates and your understanding improves. You are still in a fog, but you can see further into the forest, and the next milestone becomes visible as the last one is left behind.

Stated more directly, completion of a milestone produces a to-do list for the next phase of the project, which gives confidence to investors that the effort is on track and the scope is now evident for the next milestone. It also establishes where on the map you actually are along your journey. So, I recommend that funding rounds be aligned with these milestones.

The answer is often in your trash bin

Do not think that zero-yielding prototype material is wasted money. The fact is that a LOT OF USEFUL INFORMATION is derived from asking: How did it fail? How far off target was the performance? Does it work slightly better at a particular process setting? This is data and understanding that informs how to make things work more optimally, because going in the opposite direction from “that’s bad” is in general where you will find good performance. This is the basis of some development philosophies where the emphasis is on “failing fast”, and therefore learning quickly what works and doesn’t work. Knowledge of what success looks like is derived primarily from your failures. If you have ever tried snowboarding for the first time, you learn really quickly what NOT TO DO by crashing a bunch.

Say you have a fairly consistent run of multiple batches of wafers that work pretty well! Again, don’t be over-confident that things will become easy now. One reason is that different kinds of failures happen at different probabilities. The less probable failure modes tend to hide behind the more probable ones that you notice right away. You discover them later in the project, when the big issues have been solved and you are ramping up to higher production volume, making more devices, and noticing that new pile of rejects. These new issues may also be harder to detect, and could lead to reliability complaints and field returns from your early adopters. Having good testing protocols to capture and contain all kinds of failures, characterizing them because they matter, and classifying them is critical to early R&D projects. The sooner you notice and investigate the low-probability failures, the better off you will be when you ramp production.
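A bit of arithmetic shows why the rare failure modes stay hidden during prototyping. The sketch below assumes each device fails independently with the same probability, which is a simplification (real failures often cluster by wafer or lot), and the function names are my own.

```python
import math


def prob_seen_at_least_once(failure_rate: float, devices_built: int) -> float:
    """Chance of observing at least one instance of a failure mode,
    assuming independent devices with a constant failure rate."""
    return 1.0 - (1.0 - failure_rate) ** devices_built


def devices_needed(failure_rate: float, confidence: float) -> int:
    """Smallest build size that gives the requested chance of seeing
    the failure mode at least once (same independence assumption)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# A 1% failure mode in a 25-device prototype lot: more likely missed than seen.
print(f"{prob_seen_at_least_once(0.01, 25):.0%} chance of seeing it")

# How many devices before a 0.1% failure mode is very likely to show up?
print(devices_needed(0.001, 0.95), "devices for a 95% chance")
```

Under these assumptions, a failure mode that hits one device in a thousand needs on the order of three thousand devices before you can expect to have seen it. That is why it shows up during the production ramp and not during prototyping.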

If you are just throwing away your rejects, then you are failing your investors. Your rejects are precious learning opportunities, that need to be deconstructed with “failure analysis” and understood so that you can find the root cause and make that problem go away. If you are not classifying the failure signatures, and deconstructing your devices to learn why they fail, then you are in fact failing your investors who are trusting you to be effective in your development efforts. Don’t throw insights into the trash bin please.
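Classifying rejects does not need fancy tooling to get started. Here is a minimal Python sketch of a Pareto tally, assuming each reject has already been assigned a failure signature by your test or failure analysis flow. The signature names and counts are invented for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pareto(signatures: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (signature, count, cumulative fraction) rows, with the
    most frequent failure signature first."""
    counts = Counter(signatures)
    total = sum(counts.values())
    rows = []
    running = 0
    for signature, count in counts.most_common():
        running += count
        rows.append((signature, count, running / total))
    return rows


# Hypothetical reject log from one lot (assumption for illustration).
rejects = ["open circuit"] * 6 + ["short"] * 3 + ["delamination"] * 1

for signature, count, cumulative in pareto(rejects):
    print(f"{signature:15s} {count:3d}  {cumulative:.0%}")
```

The top row tells you where to spend failure analysis effort first. The bottom rows are the low-probability failures worth keeping an eye on as volume grows.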

Time to walk the talk

While this all sounds very straightforward to the uninitiated, it is quite a different story in practice, and few startup companies follow through with these structured techniques completely and consistently throughout the project. There is no way to avoid this effort or cut corners. You either do it early or late in the project, and it always costs more to fix risks later, in terms of damage to your brand from field returns, scrap losses, and inability to satisfy customer demand.

Let's return to the fog analogy once again. Imagine you are on a hike in the woods, in a fog, with dangers that can kill you. You are unsure exactly where you are or where you are going, your backpack has limited supplies, and you have only your own ingenuity to navigate the obstacles in your path. Then, and only then, can you empathize with the feeling of deep-tech research and development in the nanofabrication space. Wouldn’t you want someone experienced on this journey?

It helps to have someone experienced in taking a project through such a technical journey. Think of a semiconductor sherpa who guides the technical and business teams through the fog, keeps your company aware of some of the dangers along the path, and keeps you working collaboratively with your foundry. That person could be truly invaluable to the success of your project. Think about reaching out for support with your effort.
