As a certified instructor at the NVIDIA Deep Learning Institute, I teach the following courses:
Large language models (LLMs) and deep neural networks (DNNs), whether applied to natural language processing (e.g., GPT-3),
computer vision (e.g., large Vision Transformers), or speech AI (e.g., wav2vec 2.0), have certain properties that set them apart
from their smaller counterparts. As LLMs and DNNs grow larger and are trained on progressively larger datasets, they can adapt
to new tasks with just a handful of training examples, accelerating the route toward artificial general intelligence.
Training models that contain tens to hundreds of billions of parameters on vast datasets isn’t trivial and requires a unique
combination of AI, high-performance computing (HPC), and systems knowledge. The goal of this course is to demonstrate how to train
the largest of neural networks and deploy them to production.
Learn how to apply and fine-tune a Transformer-based deep learning model to natural language processing (NLP) tasks.
In this course, you'll:
construct a Transformer neural network in PyTorch (a minimal sketch follows below),
build a named-entity recognition (NER) application with BERT,
deploy the NER application with ONNX and TensorRT to Triton Inference Server.
Upon completion, you’ll be proficient in task-agnostic applications of Transformer-based models.
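To give a flavor of the first exercise, here is a minimal sketch of an encoder-only Transformer for token classification (the shape of a NER model) in PyTorch. The class name and all hyperparameters are illustrative placeholders, not the course's actual lab code.

```python
import torch
from torch import nn

# A toy encoder-only Transformer for token classification (e.g., NER).
# Hyperparameters are illustrative placeholders, not the course's values.
class TokenClassifier(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, nhead=8,
                 num_layers=4, num_labels=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers)
        self.head = nn.Linear(d_model, num_labels)  # one logit per NER tag

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))
        return self.head(hidden)  # (batch, seq_len, num_labels)

model = TokenClassifier()
logits = model(torch.randint(0, 30522, (2, 16)))  # dummy batch of token ids
print(logits.shape)  # torch.Size([2, 16, 9])
```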
Recent advancements in both the techniques and accessibility of large language models (LLMs) have opened up unprecedented opportunities to help businesses streamline
their operations, decrease expenses, and increase productivity at scale. Additionally, enterprises can use LLM-powered apps to provide innovative and improved services
to clients or strengthen customer relationships. For example, enterprises could provide customer support via AI companions or use sentiment analysis apps to extract valuable customer insights.
In this course, you will gain a strong understanding and practical knowledge of LLM application development by exploring the open-source ecosystem, including pretrained LLMs,
enabling you to get started quickly with developing LLM-based applications.
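As a taste of that ecosystem, here's a minimal sketch using the Hugging Face Transformers pipeline API, with a sentiment-analysis task mirroring the customer-insight example above. Hugging Face is my illustrative choice here, not necessarily the exact stack used in the course.

```python
# A minimal sketch of using a pretrained model from the open-source
# ecosystem; Hugging Face Transformers is an illustrative choice.
from transformers import pipeline

# Sentiment analysis, one of the enterprise use cases mentioned above.
classifier = pipeline("sentiment-analysis")
print(classifier("The support team resolved my issue within minutes."))
# e.g., [{'label': 'POSITIVE', 'score': 0.99...}]
```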
The evolution and adoption of large language models (LLMs) have been nothing short of revolutionary, with retrieval-based systems at the
forefront of this technological leap. These models are not just tools for automation; they are partners in enhancing productivity, capable of holding informed conversations
by interacting with a vast array of tools and documents. This course is designed for those eager to explore the potential of these systems, focusing on practical deployment
and the efficient implementation required to manage the considerable demands of both users and deep learning models. As we delve into the intricacies of LLMs, participants
will gain insights into advanced orchestration techniques that include internal reasoning, dialog management, and effective tooling strategies.
Retrieval-augmented generation (RAG) pipelines are already changing every aspect of modern enterprise operations. Countless
online tutorials demonstrate proof-of-concept-level naïve RAG applications that cannot cope with high traffic
or large document collections. This training lab bridges that gap and presents an opinionated best practice for production-level
deployment. From infrastructure sizing, through an end-to-end Helm-based deployment of NVIDIA NIM microservices, to customizing
individual pipeline components, we'll provide a high-level overview of the steps your organization will have to take to transform
early proofs of concept into enterprise-grade deployments.
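To illustrate what interacting with such a deployment looks like, here's a minimal sketch of querying a self-hosted NIM LLM endpoint through its OpenAI-compatible API. The host URL and model id below are placeholders for your own deployment, not values from the lab.

```python
# A minimal sketch of calling a self-hosted NVIDIA NIM LLM endpoint.
# NIM exposes an OpenAI-compatible API; the URL and model id below
# are placeholders for your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://nim-llm.example.internal:8000/v1",  # hypothetical host
    api_key="not-needed-for-local-nim",
)
response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # an example NIM model id
    messages=[{"role": "user", "content": "Summarize our return policy."}],
)
print(response.choices[0].message.content)
```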
In this course, you'll go beyond prompt engineering and learn a variety of techniques to efficiently customize pretrained LLMs for your specific use cases, without engaging in the
computationally intensive and expensive process of pretraining your own model or fine-tuning a model's internal weights.
Using NVIDIA NeMo service, you’ll learn various parameter-efficient fine-tuning methods to customize LLM behavior for your organization.
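To illustrate the core idea behind parameter-efficient fine-tuning, here's a minimal LoRA sketch using the Hugging Face PEFT library. This is a stand-in for illustration only; the course itself works with the NeMo service, whose API differs.

```python
# Illustrating parameter-efficient fine-tuning with LoRA via Hugging Face
# PEFT; the course uses NVIDIA NeMo, whose API differs.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # small stand-in model
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"],
                    task_type="CAUSAL_LM")
model = get_peft_model(base, config)

# Only the small adapter matrices are trainable; base weights stay frozen.
model.print_trainable_parameters()
# e.g., "trainable params: 294,912 || all params: 124,734,720 || ..."
```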
This course explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs.
You’ll learn how to:
· Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs)
· Use Numba to create and launch custom CUDA kernels
· Apply key GPU memory management techniques
Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.
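As a taste of both approaches, here's a minimal sketch of a GPU ufunc and a custom CUDA kernel in Numba. Array sizes and launch parameters are illustrative, not tuned values from the course.

```python
# A minimal sketch of the two Numba approaches covered: a CUDA ufunc
# compiled from a scalar function, and a hand-written CUDA kernel.
import numpy as np
from numba import vectorize, cuda

# 1) NumPy-style ufunc compiled for the GPU.
@vectorize(["float32(float32, float32)"], target="cuda")
def gpu_add(a, b):
    return a + b

# 2) Custom CUDA kernel with explicit thread indexing.
@cuda.jit
def scale(out, x, factor):
    i = cuda.grid(1)
    if i < x.size:          # guard against out-of-range threads
        out[i] = x[i] * factor

x = np.arange(1_000_000, dtype=np.float32)
y = gpu_add(x, x)                       # ufunc handles transfers implicitly

d_x = cuda.to_device(x)                 # explicit GPU memory management
d_out = cuda.device_array_like(d_x)
threads = 256
blocks = (x.size + threads - 1) // threads
scale[blocks, threads](d_out, d_x, 2.0)
print(np.allclose(d_out.copy_to_host(), y))  # True
```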
Whether you work at a software company that needs to improve customer retention, a financial services company that needs to mitigate risk, or a retail company
interested in predicting customer purchasing behavior, your organization is tasked with preparing, managing, and gleaning insights from large volumes of data
without wasting critical resources. Traditional CPU-driven data science workflows can be cumbersome, but with the power of GPUs, your teams can make sense of data
quickly to drive business decisions.
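To make that concrete, here's a minimal sketch of a GPU data-science workflow using RAPIDS cuDF. The library choice and the transactions.csv file are my own illustrative assumptions, since the paragraph above doesn't name a specific stack.

```python
# A minimal sketch of a GPU data-science workflow with RAPIDS cuDF,
# which mirrors the pandas API; "transactions.csv" is a hypothetical
# file of customer purchases.
import cudf

df = cudf.read_csv("transactions.csv")
spend = (df.groupby("customer_id")["amount"]
           .sum()
           .sort_values(ascending=False))
print(spend.head())  # top spenders, computed entirely on the GPU
```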
Businesses worldwide are using artificial intelligence to solve their greatest challenges.
Healthcare professionals use AI to enable faster, more accurate diagnoses for patients. Retail businesses use it to offer personalized customer shopping experiences.
Automakers use it to make personal vehicles, shared mobility, and delivery services safer and more efficient.
Deep learning is a powerful AI approach that uses multi-layered artificial neural networks to deliver state-of-the-art accuracy in tasks such as object detection, speech recognition, and language translation.
Using deep learning, computers can learn to recognize patterns in data that are too complex or subtle for expert-written software to capture.
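To make the "multi-layered network" idea concrete, here's a minimal stack of layers in PyTorch; the layer sizes are arbitrary and not tied to any particular course lab.

```python
# A minimal multi-layered neural network in PyTorch, illustrating the
# "stack of layers" idea; sizes are arbitrary placeholders.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(784, 128),  # e.g., a flattened 28x28 image as input
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),    # 10 class scores
)
scores = model(torch.randn(32, 784))  # a dummy batch of 32 examples
print(scores.shape)  # torch.Size([32, 10])
```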