Data Structures

Arrays - ds easy problem solving (basic) max score: 10 success rate: 93.29%, 2d array - ds easy problem solving (basic) max score: 15 success rate: 93.16%, dynamic array easy problem solving (basic) max score: 15 success rate: 86.74%, left rotation easy problem solving (basic) max score: 20 success rate: 91.15%, sparse arrays medium problem solving (basic) max score: 25 success rate: 97.28%, array manipulation hard problem solving (intermediate) max score: 60 success rate: 60.99%, print the elements of a linked list easy problem solving (basic) max score: 5 success rate: 97.19%, insert a node at the tail of a linked list easy problem solving (intermediate) max score: 5 success rate: 95.26%, insert a node at the head of a linked list easy problem solving (basic) max score: 5 success rate: 98.34%, insert a node at a specific position in a linked list easy problem solving (intermediate) max score: 5 success rate: 97.01%.

Learn Python practically and Get Certified .

Popular Tutorials

Popular examples, reference materials, learn python interactively, learn ds & algorithms.

A computer program is a collection of instructions to perform a specific task. For this, a computer program may need to store data, retrieve data, and perform computations on the data.

A data structure is a named location that can be used to store and organize data. And, an algorithm is a collection of steps to solve a particular problem. Learning data structures and algorithms allow us to write efficient and optimized computer programs.

Our DSA tutorial will guide you to learn different types of data structures and algorithms and their implementations in Python, C, C++, and Java.

Do you want to learn DSA the right way? Enroll in our Interactive DSA Course for FREE.

  • Introduction

Data Structures (I)

Data structures (ii), tree based dsa (i), tree based dsa (ii), graph based dsa.

  • Sorting and Searching

Greedy Algorithms

  • Dynamic Programming

Other Algorithms

  • How to learn DSA?

DSA Introduction

  • What is an algorithm?
  • Data Structure and Types
  • Why learn algorithms?
  • Asymptotic Notations
  • Master Theorem
  • Divide and Conquer Algorithm
  • Types of Queue
  • Circular Queue
  • Priority Queue
  • Linked List
  • Linked List Operations
  • Types of Linked List
  • Heap Data Structure
  • Fibonacci Heap
  • Decrease Key and Delete node from Fibonacci Heap
  • Tree Data Structure
  • Tree Traversal
  • Binary Tree
  • Full Binary Tree
  • Perfect Binary Tree
  • Complete Binary Tree
  • Balanced Binary Tree
  • Binary Search Tree
  • Insertion into B-tree
  • Deletion from B-tree
  • Insertion on a B+ Tree
  • Deletion from a B+ Tree
  • Red Black Tree
  • Insertion in Red Black Tree
  • Deletion from Red Black Tree
  • Graph Data Structure
  • Spanning Tree
  • Strongly Connected Components
  • Adjacency Matrix
  • Adjacency List
  • DFS Algorithm
  • Breadth-first Search
  • Bellman Ford's Algorithm

Sorting and Searching Algorithms

  • Bubble Sort
  • Selection Sort
  • Insertion Sort
  • Counting Sort
  • Bucket Sort
  • Linear Search
  • Binary Search
  • Greedy Algorithm
  • Ford-Fulkerson Algorithm
  • Dijkstra's Algorithm
  • Kruskal's Algorithm
  • Prim's Algorithm
  • Huffman Code
  • Floyd Warshall Algorithm
  • Longest Common Subsequence
  • Backtracking Algorithm
  • Rabin-Karp Algorithm

Why Learn DSA?

  • Write optimized and scalable code - Once you have knowledge about different data structures and algorithms, you can determine which data structure and algorithm to choose in various conditions.
  • Effective use of time and memory - Having knowledge about data structures and algorithms will help you write codes that run faster and require less storage.
  • Better job opportunities - Data structures and algorithms questions are frequently asked in job interviews of various organizations including Google, Facebook, and so on.

How you can learn data structure and algorithms?

Interactive dsa course.

Want to learn DSA with Python by solving quizzes and challenges after learning each concept? Enroll in our DSA Interactive Course for FREE.

Learn DSA from Programiz

Programiz offers a complete series of easy to follow DSA tutorials along with suitable examples. These tutorials are targeted for absolute beginners who want to dive into the field of computer programming.

Learn DSA from Books

Learning from books is always a good practice. You will get the big picture of programming concepts in the book which you may not find elsewhere.

Here are some books we personally recommend.

  • Introduction to Algorithms, Thomas H. Cormen - it is one of the best books in algorithms and covers a broad range of algorithms in-depth
  • Algorithms, Robert Sedgewick - it is the leading textbook on algorithms and is widely used in colleges and universities
  • The Art of Computer Programming, Donald E. Knuth - this book is considered best if you know the subject and are looking for deeper understanding

Learn DSA through visualization

Once you have some idea about data structure and algorithms, there is a great resource at Data Structure Visualizations that lets you learn through animation.

Browse Course Material

Course info.

  • Prof. Erik Demaine

Departments

  • Electrical Engineering and Computer Science

As Taught In

  • Algorithms and Data Structures

Learning Resource Types

Advanced data structures, assignments.

  • There will be a weekly one-page assignment, 10 assignments in total.
  • You may skip any one problem, or we will ignore the problem with the lowest grade. If you volunteered to scribe twice, we will ignore the lowest two grades.
  • The answers must be typeset in LaTeX. The answers must fit in one page, or your solution will not be read. Use at least 10 pt font and 1 inch margins. This rule is meant to prepare you for writing research publications: one often has to explain great ideas in a very limited number of pages.
  • Submissions must be made online and consist of a compiled PDF document.
  • Grades and comments will be posted online.
  • Solutions do not need to include all calculations, trivial details etc. Just prove to us that you found the solution, and you understand it well.
  • 0 = You didn’t get it. Filling one page to the brim does not mean you can’t get zero. Please don’t write stuff you know is wrong.
  • 1 = Your solution was ultimately a good one, but the write-up contained significant errors or omissions.
  • 2 = (We think) you got it.

MIT Open Learning

  • 90% Refund @Courses
  • Data Structure
  • Queue Meaning
  • Queue Introduction
  • Queue Operations
  • Queue Implementation
  • Types of Queue
  • Queue Application
  • Priority Queue
  • Stack using Queue
  • Queue Interview Problems

Related Articles

  • Explore Our Geeks Community
  • Data Structures Tutorial
  • Introduction to Data Structures

Data Structure Types, Classifications and Applications

Overview of data structures.

  • Introduction to Linear Data Structures
  • Introduction to Hierarchical Data Structure
  • Overview of Graph, Trie, Segment Tree and Suffix Tree Data Structures

Different Types of Data Structures

  • Array Data Structure
  • String Data Structure
  • Linked List Data Structure
  • Stack Data Structure
  • Queue Data Structure
  • Introduction to Tree - Data Structure and Algorithm Tutorials
  • Heap Data Structure
  • Hashing Data Structure
  • Graph Data Structure And Algorithms
  • Matrix Data Structure
  • Advanced Data Structures
  • Data Structure Alignment : How data is arranged and accessed in Computer Memory?
  • Static Data Structure vs Dynamic Data Structure
  • Static and Dynamic data structures in Java with Examples
  • Common operations on various Data Structures
  • Real-life Applications of Data Structures and Algorithms (DSA)

Different Types of Advanced Data Structures

What is data structure:.

A data structure is a storage that is used to store and organize data. It is a way of arranging data on a computer so that it can be accessed and updated efficiently.

A data structure is not only used for organizing the data. It is also used for processing, retrieving, and storing data. Different basic and advanced types of data structures are used in almost every program or software system that has been developed. So we must have good knowledge of data structures. 

Data structures are an integral part of computers used for the arrangement of data in memory. They are essential and responsible for organizing, processing, accessing, and storing data efficiently. But this is not all. Various types of data structures have their characteristics, features, applications, advantages, and disadvantages. So how do you identify a data structure that is suitable for a particular task? What is meant by the term ‘Data Structure’? How many types of data structures are there and what are they used for?

What is Data Structure: Types, Classifications and Applications

What is Data Structure: Types, Classifications, and Applications

We have got you covered. We have made a complete list of everything about what data structure is, what are the types of data structures, the classification of data structures, the applications of each data structure, and so on. In this article, we will discuss every aspect of each data structure to help you choose the best one in just minutes.

Table of Contents

  • What is Data Structure?
  • How Data Structure varies from Data Type?
  • 1) Static data structure
  • 2) Dynamic data structure
  • Non-linear Data Structure
  • Characteristics of an Array
  • Applications of Array
  • Real-Life Applications of Array
  • Characteristics of a Linked List
  • Applications of Linked List
  • Real-Life Applications of Linked List
  • Characteristics of a Stack
  • Applications of Stack
  • Real-Life Applications of Stack
  • Characteristics of a Queue
  • Applications of Queue
  • Real-Life Applications of Queue
  • Characteristics of a Tree
  • Applications of Tree
  • Real-Life Applications of Tree
  • Characteristics of a Graph
  • Applications of Graph
  • Real-Life Applications of Graph

How Data Structure varies from Data Type:

We already have learned about data structure. Many times, what happens is that people get confused between data type and data structure. So let’s see a few differences between data type and data structure to make it clear.

Classification of Data Structure: 

Data structure has many different uses in our daily life. There are many different data structures that are used to solve different mathematical and logical problems. By using data structure, one can organize and process a very large amount of data in a relatively short period. Let’s look at different data structures that are used in different situations.   

Classification of Data Structure

Classification of Data Structure

  • Static data structure: Static data structure has a fixed memory size. It is easier to access the elements in a static data structure.  An example of this data structure is an array.
  • Dynamic data structure: In the dynamic data structure, the size is not fixed. It can be randomly updated during the runtime which may be considered efficient concerning the memory (space) complexity of the code.  Examples of this data structure are queue, stack, etc.
  • Non-linear data structure: Data structures where data elements are not placed sequentially or linearly are called non-linear data structures. In a non-linear data structure, we can’t traverse all the elements in a single run only.  Examples of non-linear data structures are trees and graphs.

Need Of Data structure :

The structure of the data and the synthesis of the algorithm are relative to each other. Data presentation must be easy to understand so the developer, as well as the user, can make an efficient implementation of the operation. Data structures provide an easy way of organizing, retrieving, managing, and storing data. Here is a list of the needs for data.

  • Data structure modification is easy. 
  • It requires less time. 
  • Save storage memory space. 
  • Data representation is easy. 
  • Easy access to the large database.  

An array is a linear data structure and it is a collection of items stored at contiguous memory locations. The idea is to store multiple items of the same type together in one place. It allows the processing of a large amount of data in a relatively short period. The first element of the array is indexed by a subscript of 0. There are different operations possible in an array, like Searching, Sorting, Inserting, Traversing, Reversing, and Deleting.

Array

Characteristics of an Array: 

An array has various characteristics which are as follows:

  • Arrays use an index-based data structure which helps to identify each of the elements in an array easily using the index.
  • If a user wants to store multiple values of the same data type, then the array can be utilized efficiently.
  • An array can also handle complex data structures by storing data in a two-dimensional array.
  • An array is also used to implement other data structures like Stacks, Queues, Heaps, Hash tables, etc.
  • The search process in an array can be done very easily.

Operations performed on array:

  • Initialization : An array can be initialized with values at the time of declaration or later using an assignment statement.
  • Accessing elements: Elements in an array can be accessed by their index, which starts from 0 and goes up to the size of the array minus one.
  • Searching for elements : Arrays can be searched for a specific element using linear search or binary search algorithms.
  • Sorting elements : Elements in an array can be sorted in ascending or descending order using algorithms like bubble sort, insertion sort, or quick sort.
  • Inserting elements: Elements can be inserted into an array at a specific location, but this operation can be time-consuming because it requires shifting existing elements in the array.
  • Deleting elements: Elements can be deleted from an array by shifting the elements that come after it to fill the gap.
  • Updating elements: Elements in an array can be updated or modified by assigning a new value to a specific index.
  • Traversing elements: The elements in an array can be traversed in order, visiting each element once.

These are some of the most common operations performed on arrays. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used.

Applications of Array: 

Different applications of an array are as follows:

  • An array is used in solving matrix problems.
  • Database records are also implemented by an array.
  • It helps in implementing a sorting algorithm.
  • It is also used to implement other data structures like Stacks, Queues, Heaps, Hash tables, etc.
  • An array can be used for CPU scheduling.
  • Can be applied as a lookup table in computers.
  • Arrays can be used in speech processing where every speech signal is an array.
  • The screen of the computer is also displayed by an array. Here we use a multidimensional array. 
  • The array is used in many management systems like a library, students, parliament, etc. 
  • The array is used in the online ticket booking system. Contacts on a cell phone are displayed by this array. 
  • In games like online chess, where the player can store his past moves as well as current moves. It indicates a hint of position. 
  • To save images in a specific dimension in the android Like 360*1200 

Real-Life Applications of Array: 

  • An array is frequently used to store data for mathematical computations.
  • It is used in image processing.
  • It is also used in record management.
  • Book pages are also real-life examples of an array.
  • It is used in ordering boxes as well.

Want to get started with arrays? You can try out our curated articles and lists for the best practice:

  • Introduction to Array Data Structure
  • Top 50 Array Coding Problems for Interviews
  • Practice Array problem on GeeksforGeeks  

Linked list: 

A linked list is a linear data structure in which elements are not stored at contiguous memory locations. The elements in a linked list are linked using pointers as shown in the below image: 

Types of linked lists:

  • Singly-linked list
  • Doubly linked list
  • Circular linked list
  • Doubly circular linked list

Linked List

  • Linked List

Characteristics of a Linked list: 

A linked list has various characteristics which are as follows:

  • A linked list uses extra memory to store links.
  • During the initialization of the linked list, there is no need to know the size of the elements.
  • Linked lists are used to implement stacks, queues, graphs, etc.
  • The first node of the linked list is called the Head.
  • The next pointer of the last node always points to NULL.
  • In a linked list, insertion and deletion are possible easily.
  • Each node of the linked list consists of a pointer/link which is the address of the next node.
  • Linked lists can shrink or grow at any point in time easily.

Operations performed on Linked list:

A linked list is a linear data structure where each node contains a value and a reference to the next node. Here are some common operations performed on linked lists:

  • Initialization: A linked list can be initialized by creating a head node with a reference to the first node. Each subsequent node contains a value and a reference to the next node.
  • Inserting elements: Elements can be inserted at the head, tail, or at a specific position in the linked list.
  • Deleting elements : Elements can be deleted from the linked list by updating the reference of the previous node to point to the next node, effectively removing the current node from the list.
  • Searching for elements : Linked lists can be searched for a specific element by starting from the head node and following the references to the next nodes until the desired element is found.
  • Updating elements : Elements in a linked list can be updated by modifying the value of a specific node.
  • Traversing elements: The elements in a linked list can be traversed by starting from the head node and following the references to the next nodes until the end of the list is reached.
  • Reversing a linked list : The linked list can be reversed by updating the references of each node so that they point to the previous node instead of the next node.

These are some of the most common operations performed on linked lists. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used.

Applications of the Linked list: 

Different applications of linked lists are as follows:

  • Linked lists are used to perform arithmetic operations on long integers.
  • It is used for the representation of sparse matrices.
  • It is used in the linked allocation of files.
  • It helps in memory management.
  • It is used in the representation of Polynomial Manipulation where each polynomial term represents a node in the linked list.
  • Linked lists are used to display image containers. Users can visit past, current, and next images.
  • They are used to store the history of the visited page.
  • They are used to perform undo operations.
  • Linked are used in software development where they indicate the correct syntax of a tag.
  • Linked lists are used to display social media feeds. 

Real-Life Applications of a Linked list: 

  • A linked list is used in Round-Robin scheduling to keep track of the turn in multiplayer games.
  • It is used in image viewer. The previous and next images are linked, and hence can be accessed by the previous and next buttons.
  • In a music playlist, songs are linked to the previous and next songs. 

Want to get started with a linked list? You can try out our curated articles and lists for the best practice:

  • Introduction to Linked list Data Structure
  • Top 20 Linked list Interview Questions
  • Practice Linked List problem on GeeksforGeeks  

Stack is a linear data structure that follows a particular order in which the operations are performed. The order is LIFO(Last in first out) . Entering and retrieving data is possible from only one end. The entering and retrieving of data is also called push and pop operation in a stack. There are different operations possible in a stack like reversing a stack using recursion, Sorting, Deleting the middle element of a stack, etc. 

data structures for assignment

Characteristics of a Stack: 

Stack has various different characteristics which are as follows:

  • Stack is used in many different algorithms like Tower of Hanoi, tree traversal, recursion, etc.
  • Stack is implemented through an array or linked list.
  • It follows the Last In First Out operation i.e., an element that is inserted first will pop in last and vice versa.
  • The insertion and deletion are performed at one end i.e. from the top of the stack.
  • In stack, if the allocated space for the stack is full, and still anyone attempts to add more elements, it will lead to stack overflow.

Applications of Stack: 

Different applications of Stack are as follows:

  • The stack data structure is used in the evaluation and conversion of arithmetic expressions.
  • It is used for parenthesis checking.
  • While reversing a string, the stack is used as well.
  • Stack is used in memory management.
  • It is also used for processing function calls. 
  • The stack is used to convert expressions from infix to postfix. 
  • The stack is used to perform undo as well as redo operations in word processors. 
  • The stack is used in virtual machines like JVM. 
  • The stack is used in the media players. Useful to play the next and previous song. 
  • The stack is used in recursion operations. 

Operation performed on stack ;

A stack is a linear data structure that implements the Last-In-First-Out (LIFO) principle. Here are some common operations performed on stacks:

  • Push : Elements can be pushed onto the top of the stack, adding a new element to the top of the stack.
  • Pop : The top element can be removed from the stack by performing a pop operation, effectively removing the last element that was pushed onto the stack.
  • Peek: The top element can be inspected without removing it from the stack using a peek operation.
  • IsEmpty : A check can be made to determine if the stack is empty.
  • Size : The number of elements in the stack can be determined using a size operation.

These are some of the most common operations performed on stacks. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used. Stacks are commonly used in applications such as evaluating expressions, implementing function call stacks in computer programs, and many others.

Real-Life Applications of Stack: 

  • Real life example of a stack is the layer of eating plates arranged one above the other. When you remove a plate from the pile, you can take the plate to the top of the pile. But this is exactly the plate that was added most recently to the pile. If you want the plate at the bottom of the pile, you must remove all the plates on top of it to reach it.  
  • Browsers use stack data structures to keep track of previously visited sites.
  • Call log in mobile also uses stack data structure. 

Want to get started with Stack? You can try out our curated articles and lists for the best practice:

  • Introduction to Stack Data Structure
  • Practice Stack problem on GeeksforGeeks  

Queue is a linear data structure that follows a particular order in which the operations are performed. The order is First In First Out(FIFO) i.e. the data item stored first will be accessed first. In this, entering and retrieving data is not done from only one end. An example of a queue is any queue of consumers for a resource where the consumer that came first is served first. Different operations are performed on a Queue like Reversing a Queue (with or without using recursion), Reversing the first K elements of a Queue, etc. A few basic operations performed In Queue are enqueue, dequeue, front, rear, etc.

data structures for assignment

Characteristics of a Queue: 

The queue has various different characteristics which are as follows:

  • The queue is a FIFO (First In First Out) structure.
  • To remove the last element of the Queue, all the elements inserted before the new element in the queue must be removed.
  • A queue is an ordered list of elements of similar data types.

Applications of Queue: 

Different applications of Queue are as follows:

  • Queue is used for handling website traffic.
  • It helps to maintain the playlist in media players.
  • Queue is used in operating systems for handling interrupts.
  • It helps in serving requests on a single shared resource, like a printer, CPU task scheduling, etc.
  • It is used in the asynchronous transfer of data e.g. pipes, file IO, and sockets.
  •  Queues are used for job scheduling in the operating system.
  • In social media to upload multiple photos or videos queue is used.
  • To send an e-mail queue data structure is used.
  • To handle website traffic at a time queues are used.
  • In Windows operating system, to switch multiple applications.

Operation performed on queue:

A queue is a linear data structure that implements the First-In-First-Out (FIFO) principle. Here are some common operations performed on queues:

  • Enqueue : Elements can be added to the back of the queue, adding a new element to the end of the queue.
  • Dequeue : The front element can be removed from the queue by performing a dequeue operation, effectively removing the first element that was added to the queue.
  • Peek : The front element can be inspected without removing it from the queue using a peek operation.
  • IsEmpty : A check can be made to determine if the queue is empty.
  • Size : The number of elements in the queue can be determined using a size operation.

These are some of the most common operations performed on queues. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used. Queues are commonly used in applications such as scheduling tasks, managing communication between processes, and many others.

Real-Life Applications of Queue: 

  • A real-world example of a queue is a single-lane one-way road, where the vehicle that enters first will exit first. 
  • A more real-world example can be seen in the queue at the ticket windows.
  • A cashier line in a store is also an example of a queue.
  • People on an escalator 

Want to get started with Queue? You can try out our curated articles and lists for the best practice:

  • Introduction to Queue Data Structure
  • Practice Queue problem on GeeksforGeeks  

A tree is a non-linear and hierarchical data structure where the elements are arranged in a tree-like structure. In a tree, the topmost node is called the root node. Each node contains some data, and data can be of any type. It consists of a central node, structural nodes, and sub-nodes which are connected via edges. Different tree data structures allow quicker and easier access to the data as it is a non-linear data structure. A tree has various terminologies like Node, Root, Edge, Height of a tree, Degree of a tree, etc. 

There are different types of Tree-like 

  • Binary Tree , 
  • Binary Search Tree , 
  • AVL Tree , 
  • B-Tree, etc.

Tree

Characteristics of a Tree: 

The tree has various different characteristics which are as follows:

  • A tree is also known as a Recursive data structure.
  • In a tree, the Height of the root can be defined as the longest path from the root node to the leaf node.
  • In a tree, one can also calculate the depth from the top to any node. The root node has a depth of 0.

Applications of Tree: 

Different applications of Tree are as follows:

  • Heap is a tree data structure that is implemented using arrays and used to implement priority queues.
  • B-Tree and B+ Tree are used to implement indexing in databases.
  • Syntax Tree helps in scanning, parsing, generation of code, and evaluation of arithmetic expressions in Compiler design.
  • K-D Tree is a space partitioning tree used to organize points in K-dimensional space.
  • Spanning trees are used in routers in computer networks.

Operation performed on tree:

A tree is a non-linear data structure that consists of nodes connected by edges. Here are some common operations performed on trees:

  • Insertion : New nodes can be added to the tree to create a new branch or to increase the height of the tree.
  • Deletion : Nodes can be removed from the tree by updating the references of the parent node to remove the reference to the current node.
  • Search : Elements can be searched for in a tree by starting from the root node and traversing the tree based on the value of the current node until the desired node is found.
  • Traversal : The elements in a tree can be traversed in several different ways, including in-order, pre-order, and post-order traversal.
  • Height : The height of the tree can be determined by counting the number of edges from the root node to the furthest leaf node.
  • Depth : The depth of a node can be determined by counting the number of edges from the root node to the current node.
  • Balancing : The tree can be balanced to ensure that the height of the tree is minimized and the distribution of nodes is as even as possible.

These are some of the most common operations performed on trees. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used. Trees are commonly used in applications such as searching, sorting, and storing hierarchical data.

Real-Life Applications of Tree: 

  • In real life, tree data structure helps in Game Development. 
  • It also helps in indexing in databases. 
  • A Decision Tree is an efficient machine-learning tool, commonly used in decision analysis. It has a flowchart-like structure that helps to understand data.
  • Domain Name Server also uses a tree data structure.
  • The most common use case of a tree is any social networking site. 

Want to get started with Tree? You can try out our curated articles and lists for the best practice:

  • Introduction to Tree Data Structure
  • Top 50 Tree Interview Questions
  • Practice Tree problem on GeeksforGeeks 

A graph is a non-linear data structure that consists of vertices (or nodes) and edges. It consists of a finite set of vertices and set of edges that connect a pair of nodes. The graph is used to solve the most challenging and complex programming problems. It has different terminologies which are Path, Degree, Adjacent vertices, Connected components, etc.

Graph

Characteristics of Graph: 

The graph has various different characteristics which are as follows:

  • The maximum distance from a vertex to all the other vertices is considered the Eccentricity of that vertex.
  • The vertex having minimum Eccentricity is considered the central point of the graph.
  • The minimum value of Eccentricity from all vertices is considered the radius of a connected graph.

Applications of Graph: 

Different applications of Graphs are as follows:

  • The graph is used to represent the flow of computation.
  • It is used in modeling graphs.
  • The operating system uses Resource Allocation Graph.
  • Also used in the World Wide Web where the web pages represent the nodes.

Operation performed on Graph:

A graph is a non-linear data structure consisting of nodes and edges. Here are some common operations performed on graphs:

  • Add Vertex: New vertices can be added to the graph to represent a new node.
  • Add Edge: Edges can be added between vertices to represent a relationship between nodes.
  • Remove Vertex : Vertices can be removed from the graph by updating the references of adjacent vertices to remove the reference to the current vertex.
  • Remove Edge : Edges can be removed by updating the references of the adjacent vertices to remove the reference to the current edge.
  • Depth-First Search (DFS) : A graph can be traversed using a depth-first search by visiting the vertices in a depth-first manner.
  • B readth-First Search (BFS): A graph can be traversed using a breadth-first search by visiting the vertices in a breadth-first manner.
  • Shortest Path: The shortest path between two vertices can be determined using algorithms such as Dijkstra’s algorithm or A* algorithm.
  • Connected Components : The connected components of a graph can be determined by finding sets of vertices that are connected to each other but not to any other vertices in the graph.
  • Cycle Detection : Cycles in a graph can be detected by checking for back edges during a depth-first search.

These are some of the most common operations performed on graphs. The specific operations and algorithms used may vary based on the requirements of the problem and the programming language used. Graphs are commonly used in applications such as computer networks, social networks, and routing problems.

Real-Life Applications of Graph: 

  • One of the most common real-world examples of a graph is Google Maps where cities are located as vertices and paths connecting those vertices are located as edges of the graph.
  • A social network is also one real-world example of a graph where every person on the network is a node, and all of their friendships on the network are the edges of the graph.
  • A graph is also used to study molecules in physics and chemistry. 

Want to get started with Graph? You can try out our curated articles and lists for the best practice:

  • Introduction to Graph Data Structure
  • Top 50 Graph Interview Questions
  • Practice Graph problem on GeeksforGeeks 

Advantages of data structure:

  • Improved data organization and storage efficiency.
  • Faster data retrieval and manipulation.
  • Facilitates the design of algorithms for solving complex problems.
  • Eases the task of updating and maintaining the data.
  • Provides a better understanding of the relationships between data elements.  

Disadvantage of Data Structure:

  • Increased computational and memory overhead.
  • Difficulty in designing and implementing complex data structures.
  • Limited scalability and flexibility.
  • Complexity in debugging and testing.
  • Difficulty in modifying existing data structures.

Data structures can be found in various computer science textbooks and online resources. Some popular texts include:

  • “Introduction to Algorithms” by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
  • “Data Structures and Algorithm Analysis in Java” by Mark Allen Weiss.
  • “The Algorithm Design Manual” by Steven S. Skiena.
  • Online resources such as Coursera, Udemy, and Khan Academy also offer courses on data structures and algorithms.

Although these are the most widely known and used data structures, there are some other forms of data structures as well which are used in Computer Science, such as policy-based data structures , etc. But no matter which data structure you choose, each one has its perks and disadvantages, without the knowledge of which, it can be very costly to choose the wrong type of data structure. So it is very important to understand the need of the situation, and then decide which kind of data structure suits best for the job.

  • DSA in Java
  • DSA in Python
  • DSA in JavaScript

Please Login to comment...

  • Data Structures
  • RishabhPrabhu
  • ManasChhabra2
  • reshmapatil2772
  • sriparnxnw7
  • avinashrayz28
  • anikettchavan
  • tarunsarawgi_gfg
  • 10 Best IPTV For TiviMate 2024
  • 10 Best AI Virtual Assistants to Make You More Productive in 2024
  • How To Add My Home In Google Map
  • Tom Cruise Net Worth 2024
  • 10 Best Free AI Art Generators to Create Image From Text [Free & Paid]

Improve your Coding Skills with Practice

 alt=

What kind of Experience do you want to share?

600.226: Data Structures (Spring 2017)

Assignment 4: stacking queues.

  • Out on: February 23, 2017
  • Due by: Thursday , March 2 before 10:00 pm
  • Collaboration: None
  • Grading: Packaging 10%, Style 10% (where applicable), Testing 10% (where applicable), Performance 10% (where applicable), Functionality 60% (where applicable)

The fourth assignment is mostly about stacks and queues. For the former you’ll build a simple calculator application, for the latter you’ll implement the data structure in a way that satisfies certain performance characteristics (in addition to the usual correctness properties).

Problem 1: Calculating Stacks (50%)

Your first task is to implement a basic RPN calculator that supports integer operands like 1 , 64738 , and -42 as well as the (binary) integer operators + , - , * , / , and % . The style of arithmetic expressions our calculator will evaluate is also called a post-fix notation. Stacks are great for doing this job! Your task is to write a driver (client) program that uses our Stack interface and one of the given implementations to perform these calculations as specified here.

Your program should be called Calc and work as follows:

  • The user enters input through System.in consisting of operands and operators, presumably in post-fix notation. We have also included some extra operators to get information about results and the current state of the stack.
  • If the user enters a valid integer, you push that integer onto the stack.
  • If the user enters a valid operator, you pop two integers off the stack, perform the requested operation, and push the result back onto the stack.
  • If the user enters the symbol ? (that’s a question mark), you print the current state of the stack using its toString method followed by a new line.
  • If the user enters the symbol ^ (that’s a caret), you pop the top element off the stack and print only that element (not the entire stack) followed by a new line.
  • If the user enters the symbol ! (that’s an exclamation mark or bang), you exit the program.

Note that there are a number of error conditions that your program must deal with gracefully for full credit. We’ll give you two examples for free, you’ll have to figure out any further error conditions for yourself:

  • If the user enters blah or 1.5 or anything else that doesn’t make sense for an integer calculator as specified above, your program should make it clear that it can’t do anything helpful with that input; but it should not stop at that point.
  • If the user requests an operation for which there are not enough operands on the stack, your program should notify the user of the problem but leave the stack unchanged; again, it should certainly not stop at that point.

Of course this means that you’ll have to print error messages to the user. Error messages must be printed to standard error and not to standard output! (Of course, the regular input and output is done through standard input and standard output as usual.) Furthermore, all error messages must start with the symbol # (that’s a hash sign) and be followed by a new line!

Here are two examples for interacting with Calc that will hopefully help you understand what you’re trying to achieve. First a “slow” example:

Here $ is the shell prompt. After starting the program, the first command was ? to print the stack (which is empty at this point, hence [] is the output). Then the user typed 10 followed by ? and we see that the stack now holds that number: [10] . Now the user typed two numbers 20 30 in sequence before hitting return. When we check the stack now using ? we get the answer [30, 20, 10] so obviously the “top” of the stack is to the left. Then we see the * operator being typed, which will multiply the top two numbers. We use ? again to check the result: [600, 10] . This is followed by the + operator, which will add the top two numbers. Again we check with ? and get [610] as we’d expect. The ^ command prints the same result 610 and pops if off the stack. So the next ? shows an empty stack again. Finally the user typed the ! command to quit, returning us to the shell. Here’s the same example, done “fast” this time:

As you can see, if the entire sequence of integers, operators, and commands is entered on a single line, they are all executed in order. It’s like having our own little programming language! Finally, here’s an example for the sample error conditions described above:

Note in particular that blah and 1.0 lead to error messages but are otherwise ignored (the program doesn’t stop); same for the two + operations when the stack only has a single element (the program doesn’t even modify the stack in that case).

Implementation Details and Hints

  • You must create an empty Stack to hold intermediate results and then repeatedly accept input from the user. It doesn’t matter whether you use the ArrayStack or the ListStack we provide, what does matter is that the specific type only appears once in your program. (In other words, the type of the stack reference variable you use in your program must be Stack and not ArrayStack or ListStack . Polymorphism!)
  • Note that we’re dealing with integers only (type Integer in Java) so / stands for integer division and % stands for integer remainder . Both of these should behave in Calc just like they do in Java. The details are messy but worth knowing about, especially regarding modulus .
  • You may find it interesting to read up on Autoboxing and Unboxing in Java. It’s the reason we can use our generic Stack implementations for Integer objects yet still do arithmetic like we would on regular int variables.
  • Only if you’re not afraid of learning on your own: You’ll be able to use the matches method of the String class to your advantage when it comes to checking whether a valid operator was entered. (But you can just as well do it with a bunch of separate comparisons or a simple String variable containing all the valid operation symbols if you don’t want to learn about regular expressions .)

Problem 2: Hacking Growable Deques (50%)

Your second task is to implement a generic ArrayDeque class as outlined in lecture. As is to be expected, ArrayDeque must implement the Deque interface we provided on Piazza .

Your implementation must be done in terms of an array that grows by doubling as needed. It’s up to you whether you want to use a built-in Java array or the SimpleArray class you know and love; just in case you prefer the latter, we’ve once again included it on the Piazza post for this assignment. Your initial array must have a length of one slot only! (Trust us, that’s going to make debugging the “doubling” part a lot easier.)

Your implemention must support all Deque operations except insertion in (worst-case) constant time ; insertion can take longer when you need to grow the array, but overall all insertion operations must be constant amortized time as discussed in lecture.

You should provide a toString method in addition to the methods required by the Deque interface. The toString will orient the front of the deque at the left and the back at the right. For example, a new dequeue into which 1, 2, and 3 were inserted using insertBack() should print as [1, 2, 3] whereas an empty dequeue should print as [] .

You must write JUnit 4 test drivers for the Deque interface and your ArrayDeque class. All the general test cases should go into DequeTestBase.java whereas test cases specific to ArrayDeque (if any!) should go into ArrayDequeTest.java . (Follow the example for testing the Array interface and its various implementations we posted on Piazza and discussed in lecture.)

Be sure to test all methods and all exceptions as well. Note that it is not enough to have just one test case for each method; there are plenty of complex interactions between the methods that need to be covered as well. (And yes, of course you need to test toString !)

Documentation

Don’t forget to add proper javadoc comments for your ArrayDeque class. Running checkstyle will remind you to do this!

General Assignment Hints

  • Ensure that the version of your code you hand in does not produce any extraneous debugging output anymore!
  • Pay attention to edge cases in the input your classes and programs are expected to handle! For example, make sure that you handle the empty input in a reasonable way for Problem 1.
  • Private helper methods are your friends. Your best friends, actually! If you don’t write plenty of them, you’ll have a much harder time getting your code to work.

Bonus Problem (0%)

Develop an algebraic specification for the abstract data type Queue . Use new , empty , enqueue , dequeue , and front (with the meaning of each as discussed in lecture) as your set of operations. Consider unbounded queues only, unless of course you want to do a bonus bonus problem on bounded queues as well.

The difficulty is going to be modelling the FIFO (first-in-first-out) behavior accurately. You’ll probably need at least one axiom with a case distinction using an if expression; the syntax for this in the Array specification for example.

Doing this problem without resorting to Google may be rather helpful for the upcoming midterm. There’s no need to submit the problem, but of course you can submit it if you wish; just include it at the end of your README file.

Deliverables

You must turn in a zipped ( .zip only) archive containing all source code, your README file, and any other deliverables required by the assignment. The filename should be HW##-jhed.zip with ## replaced by the 2-digit number (use leading 0s) of this assignment (see above) and jhed replaced by your Blackboard login. (For example, Peter would use HW03-pfroehl1.zip for his submission of Assignment 3.) The zip should contain no derived files whatsoever (i.e. no .class files, no .html files, etc.), but should allow building all derived files. Include a plain text README file (not README.txt or README.docx or whatnot) that briefly explains what your programs do and contains any other notes you want us to check out before grading. Your answers to written problems should be in this README file as well. Finally, make sure to include your name and email address in every file you turn in (well, in every file for which it makes sense to do so anyway)!

For reference, here is a short explanation of the grading criteria; some of the criteria don't apply to all problems, and not all of the criteria are used on all assignments.

Packaging refers to the proper organization of the stuff you hand in, following both the guidelines for Deliverables above as well as the general submission instructions for assignments.

Style refers to Java programming style, including things like consistent indentation, appropriate identifier names, useful comments, suitable javadoc documentation, etc. Many aspects of this are enforced automatically by Checkstyle when run with the configuration file available on Piazza . Style also includes proper modularization of your code (into interfaces, classes, methods, using public , protected , and private appropriately, etc.). Simple, clean, readable code is what you should be aiming for.

Testing refers to proper unit tests for all of the data structure classes you developed for this assignment, using the JUnit 4 framework as introduced in lecture. Make sure you test all (implied) axioms that you can think of and all exception conditions that are relevant.

Performance refers to how fast/with how little memory your program can produce the required results compared to other submissions.

Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.

If your programs cannot be built you will get no points whatsoever. If your programs cannot be built without warnings using javac -Xlint:all we will take off 10% (except if you document a very good reason; no, you cannot use the @SuppressWarnings annotation either). If your programs fail miserably even once, i.e. terminate with an exception of any kind, we will take off 10% (however we'll also take those 10% off if you're trying to be "excessively smart" by wrapping your whole program into a universal try-catch).

Learn Algorithms and Data Structures in Python

Algorithms and data structures are important for most programmers to understand.

We just released a course on the freeCodeCamp YouTube channel that is a beginner-friendly introduction to common data structures (linked lists, stacks, queues, graphs) and algorithms (search, sorting, recursion, dynamic programming) in Python.

This course will help you prepare for coding interviews and assessments. In this course, you will:

  • Watch live hands-on coding-focused video tutorials
  • Practice coding with cloud Jupyter notebooks
  • Solve questions from real programming interviews

Aakash N S teaches this course. He is the co-founder and CEO of Jovian and has created many popular courses about machine learning and programming.

The course is broken up into a series of lessons, assignments, and projects. There are Jupyter Notebook files to go along with each section.

Here is what is covered in the course:

Lesson 1 - Binary Search, Linked Lists and Complexity

  • Linear and Binary Search
  • Complexity and Big O Notation
  • Linked Lists using Python Classes

Assignment 1 - Binary Search Practice

  • Understand and solve a problem systematically
  • Implement linear search and analyze it
  • Optimize the solution using binary search

Lesson 2 - Binary Search Trees, Traversals and Recursion

  • Binary trees, traversals, and recursion
  • Binary search trees & common operations
  • Balanced binary trees and optimizations

Assignment 2 - Hash Tables and Python Dictionaries

  • Hash tables from scratch in Python
  • Handling collisions using linear probing
  • Replicating Python dictionaries

Lesson 3 - Sorting Algorithms and Divide & Conquer

  • Bubble sort and Insertion Sort
  • Merge sort using Divide & Conquer
  • Quicksort and average complexity

Assignment 3 - Divide and Conquer Practice

  • Implement polynomial multiplication
  • Optimize using divide and conquer
  • Analyze time and space complexity

Lesson 4 - Recursion and Dynamic Programming

  • Recursion and memoization
  • Subsequence and knapsack problems
  • Backtracking and pruning

Lesson 5 - Graph Algorithms (BFS, DFS & Shortest Paths)

  • Graphs, trees, and adjacency lists
  • Breadth-first and depth-first search
  • Shortest paths and directed graphs

Project - Step-by-Step Solution to a Programming Problem

  • Pick an interesting coding problem
  • Solve the problem step-by-step
  • Document and present the solution

Lesson 6 - Python Interview Questions, Tips & Advice

  • Practice questions and solutions
  • Tips for solving coding challenges
  • Advice for cracking coding interviews

Watch the course below or on the freeCodeCamp.org YouTube channel (13-hour watch).

I'm a teacher and developer with freeCodeCamp.org. I run the freeCodeCamp.org YouTube channel.

If you read this far, thank the author to show them you care. Say Thanks

Learn to code for free. freeCodeCamp's open source curriculum has helped more than 40,000 people get jobs as developers. Get started

CSCI 2720 : Data Structures

The design, analysis, and implementation of data structures and their associated algorithms; Lists; Stacks; Queues and Priority Queues; Trees; Graphs and Dictionaries; Time and Space Complexity; Sorting and Searching; Advanced problem-solving, and Introduction Algorithm Design Strategies.

Laptops are required. You must bring your own laptop to class, equipped with the necessary software for in-class programming exercises.

Course Material & Textbook

We recommend the followingon-line textbooks (first one is not free).

  • Vital Source has the digital versions listed at $26.00 USD for a 150-day rental.
  • Algorithms, 4th Edition   by Robert Sedgewick and Kevin Wayne Link: here . This comprehensive textbook offers an in-depth exploration of algorithms and data structures, emphasizing practical applications and a scientific approach to understanding performance. It also have videos available.
  • Data Structures & Algorithm Analysis by Clifford A. Shaffer, 2013. Links: here , and pdf .
  • Data Structures and Algorithms in Java 2nd Edition by Robert Lafore 2003. This book, though dated, offers a solid foundation in data structures and algorithms using Java. Link: pdf

Prerequisites

CSCI 1302 : Software Development CSCI 2610 or CSCI 2610E or CSCI 2611): Discrete Mathematics

List of Technologies

Note you need a laptop (that you bring to class) to support the below technologies.

  • [odin] : We use our departmental server, odin.cs.uga.edu for testing and submitting programming projects. You may need to use a client such as mobaXterm or fileZilla to transfer files between you home computer and our odin server. Make sure you have access to Odin (this is part of our first homework assignment) If you encounter issues with access to Odin, please contact: ---> [email protected]
  • [elC ]: We will utilize eLC posting links to assignments, and post grades.
  • [Survey Monkey]: may be used for certain activities.
  • [Piazza] : We will be using Piazza as our main communication platform. It is essential that you keep up to date with Piazza; we assume you will check this platform every day.
  • [Screen Video] : For your assignments, you will may to create a demo video showcasing your work. Please submit a private link to this video as part of your submission for projects and select assignments.
  • [Java Version (on Odin) ]: For testing, we will use the Java version specified in /usr/local/bin/java on the odin.cs.uga.edu server, currently '17.0.3.1' (2022-04-22 LTS).
  • [IntelliJ IDE] We will use IntelliJ as our java IDE. You can get a free professional version (Ultimate) as a student, there is also a free consumer version (Community Edition) that can work as well.
  • [JUnit 5 Jupiter] : We will JUnit 5 for our testing framework.

Grading Policy and Evaluation:

You are evaluated on Assignments, Quizzes, and Exams. The distribution is below, but is subject to change.

NOTE: Syllabus & policies are solidified after 3rd week of class - but there still may be minor changes.

The general grading philosophy is the following (the context is "programs", "assignments" and "projects" but it should generalize to summaries and to other required elements). See conversion from percentage to letter grades finger grade in link further below.

A Extraordinary, goes beyond the minimum criteria both in depth and breath. Design is thorough and well thought out, code base and implementation is beautiful- modularized, complete, clear and concise. A well thought out debugging strategy is apparent. Well documented code. The evaluation plan is well executed and well thought out, goes beyond the simple cases and illustrates strong problem solving skills.

A- Superior (somewhat less than an A, almost perfect code, beautiful, concise, minimum criteria met, not as much depth and forethought as an A+ but it is superb), goes beyond meeting the minimum criteria demonstrates depth, can apply solution to a variation of similar problems.

B Good , minimum criteria well executed and well done, shows some depth and understanding, meets most test cases but does not demonstrate as much forethought as an A; [threshold].

C Fair , minimum criteria met , but could have been executed with more depth and forethought.

Grade Breakdown

We use the College Board's convention to convert from percent grades to letter grades grades:

(see here ).

Late Policy

Assignments, and summaries (if applicable to this class) are due on the date and time specified in the assignment or the day of demonstration (email as a time stamp and hand-in hardcopy the next day). Late homework cannot be accepted . There are some exceptions with mitigating circumstances - please discuss these with instructor before due dates. {before is important}. If late submissions become acceptable, assignments are subject to a 10 points deduction for every day late including weekends, & holidays.

Assignment Posting

  • Homework assignments, which may include both coding and analysis components, will be available on the scheduling page, accessible via the top menu bar link on our website. These can be accessed from anywhere with an Internet connection.
  • Coding components should be submitted to the odin server.
  • Analysis components should be submitted through a designated folder on eLC.
  • Please be aware that homework assignments might be posted before their official due dates. Until they are officially due, these materials are provisional and may be subject to changes, including potential additional requirements.
  • The primary programming language for this course is Java, and we will test your coding assignments on the odin server.

Assignment Quality

Each assignment must adhere to the following criteria:

  • Code Quality: Submissions should have clean, structured code following established coding styles.
  • Comments: Include clear comments explaining code functionality and logic.
  • File Relevance: Only necessary files are to be submitted; omit any extraneous files.

Failure to meet these standards will affect your assignment grade.

For assignments utilizing test cases, be aware that additional test cases, not provided on the assignment date, may be used to evaluate the robustness and quality of your programming. Prepare your code to handle a variety of scenarios.

Expected Class Workload

In a college level course, students are generally expected to dedicate approximately 2-3 hours of out-of-class work for each credit hour per week. The University of Georgia (and the federal definition of a credit hour) recommends a **minimum** of 2 hours of outside work per credit hour each week ( link ).

This equates to an average of 8-12 hours per week on course-related activities outside of class. In our course, we align with this standard and anticipate an average outside class workload of about 10 hours per week on average (in the middle of the standard).

All assignments are individual assignments unless explicitly specified.

Grade Appeals/Contests

Any request for grade adjustments or grade appeal to be made to any assignment including exams must be made within 7 days of when the assignment is returned via a private post on piazza to instructors. Adjustments will be made after 7 days have elapsed.

Communication

Please communicate via private post on piazza to instructors. We will try to respond 24-48 hours.

Participation and Attendance

We will take attendance (not every class period, but by sampling), and perfect (near) attendance will increase you grade. If you miss attendance excessively you may be withdrawn from the class - see below.

Instructor Enforced Withdrawals

Missing 7 days or more of classes ( inclusive of excused absences) is considered excessive absentia and may result in an instructor enforced withdrawal.

Academic Honesty

As a University of Georgia student, you have agreed to abide by the University’s academic honesty policy, “A Culture of Honesty,”and the Student Honor Code. All academic work must meet the standards described in “A Culture of Honesty” found at:

http://www.uga.edu/honesty .

Lack of knowledge of the academic honesty policy is not a reasonable explanation for a violation. Questions related to course assignments and the academic honesty policy should be directed to the instructor.

The Computer Science Department recognizes honesty and integrity as necessary to the academic function of the University. Therefore all students are reminded that the CS faculty requires compliance with the conduct regulations found in the University of Georgia Student Handbook.  Academic honesty means that any work you submit is your own work.

Common forms of academic dishonesty, which students should guard against, are:

  • copying from another student’s test paper or laboratory report, or allowing another student to copy from you;
  • fabricating data (computer, statistical) for an assignment;
  • helping another student to write a laboratory report or computer software code that the student will present as his own work, or accepting such help and presenting the work as your own;
  • turning in material from a public source such as a book or the Internet as your own work.

Three steps to help prevent academic dishonesty are:

  • Familiarize yourself with the regulations.
  • If you have any doubt about what constitutes academic dishonesty, ask your instructor or a staff member at the Office of the Vice President for Instruction.
  • Refuse to assist students who want to cheat.

In addition to the terms expressed above, you also agree not to make any portion of your assignments for this class {publicly or non-publicly} available for others to view .

This includes, but is not limited to:

1)  posting snippets of your code  on help websites.

2) Engaging in activities similar to this will be seen as either giving or receiving unauthorized assistance.

3) With regard to question and answer websites (e.g., StackOverflow, Yahoo Answers, etc.), you may ask  general questions  about programming on such websites that relate to your assignments in this class, however,  you must phrase such questions in a way that make them independent of the specific problem you are having.

If you need specific help with portions of your code ( while displaying your code ), then you must consult with the instructor or teaching assistants first (unless expressly and explicitly stated otherwise in the assignment description). Furthermore, if you copy or extend material from the Web (in any fashion) or other sources and incorporate that material into the submission for one of your assignments then  you must cite  where you got the code from in order to avoid plagiarism .

All faculty, staff and students are encouraged to report all suspected cases of academic dishonesty. All cases of suspected academic dishonesty (cheating) will be referred to the Office of the Vice President for Instruction for academic dishonesty. Penalties imposed by the Office of the Vice President for Instruction may include a failing grade in the course and a notation on the student’s transcript. Repeated violations are punishable by expulsion from the University.

Decorum: Expectations for In-Class and Online Behavior

Students are expected to maintain a high standard of courtesy and respect during all interactions within the class community, including peers, teaching assistants, and the instructor. This expectation applies to both in-class and on-line engagements (including social media).

Behavior that is disruptive or disrespectful may lead to a student being requested to leave the classroom. In severe or persistent cases, further actions may include the filing of a formal report by the instructor or potential withdrawal of the student from the class.

Other Class Policies

If you miss class it is your responsibility to find out what you missed if you don't attend class. Please do not email instructor and ask what you have missed - you may ask however after class.

You are required to subscribe to the class email list or forum (Piazza)- and be aware that sometimes homework is assigned or templates are distributed via this forum - or there are some hints posted in the forum - you probably do not want to miss that.

The purpose of the assignments are familiarization of concepts and details of programming. Assignments are expected to be individual work unless other specified. However, you are encouraged to ask questions of one another, and to respond to other student's questions on the email list.

Direct exchange of code is prohibited, as is line-by-line assistance from anyone or thing (this includes Internet copying) . This is checked for every assignments. If you do get help from other sources than book, slides, or TA you must acknowledge the source in the material that you turn in -but note still no copying is allowed.

Unless otherwise specified, exams are closed-book and no additional materials may be used. Missing an exam: absence due to serious illness will be an acceptable reason for missing an exam. Doctor's diagnostic note is required (these absences may be confirmed with your doctor). The final grade will be scaled accordingly.

COVID - Syllabus Addendum (we follow the institute regulation regarding COVID):

https://ovpi.uga.edu/_resources/documents/Syllabi-Info-for-Students-Fall-2021-8.6.2021.pdf

(please check Arch NEWS for updates regarding the UGA COVID policy).

Contributions:

Material is drawn from UGA previous course designers, and instructors.

data structures for assignment

Online Embedded Systems Course with Placements

Advanced Embedded Course With Placements

Advanced Embedded Course With Placements

Online Embedded IoT Course

Online Embedded IoT Course

Advanced Embedded IoT Course With Placements

Advanced Embedded IoT Course With Placements

Linux Device Drivers

Linux Device Drivers

Embedded

IoT Internship

Campus Ambassador Program

Campus Ambassador Program

  • For Corporates
  • All Courses
  • Hire Trainees
  • Short-term Courses

Schedule a Call

With Our Career Counsellor

Invalid value

  • Register now!
  • Online Degree Explore Bachelor’s & Master’s degrees
  • MasterTrack™ Earn credit towards a Master’s degree
  • University Certificates Advance your career with graduate-level learning
  • Top Courses
  • Join for Free

Stanford University

Graph Search, Shortest Paths, and Data Structures

This course is part of Algorithms Specialization

Taught in English

Some content may not be translated

Tim Roughgarden

Instructor: Tim Roughgarden

Financial aid available

83,796 already enrolled

(1,956 reviews)

Skills you'll gain

  • Data Structure

Details to know

data structures for assignment

Add to your LinkedIn profile

See how employees at top companies are mastering in-demand skills

Placeholder

Build your subject-matter expertise

  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

Placeholder

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV

Share it on social media and in your performance review

Placeholder

There are 4 modules in this course

The primary topics in this part of the specialization are: data structures (heaps, balanced search trees, hash tables, bloom filters), graph primitives (applications of breadth-first and depth-first search, connectivity, shortest paths), and their applications (ranging from deduplication to social network analysis).

Breadth-first and depth-first search; computing strong components; applications.

What's included

9 videos 4 readings 2 quizzes

9 videos • Total 162 minutes

  • Graph Search - Overview • 23 minutes • Preview module
  • Breadth-First Search (BFS): The Basics • 14 minutes
  • BFS and Shortest Paths • 7 minutes
  • BFS and Undirected Connectivity • 13 minutes
  • Depth-First Search (DFS): The Basics • 7 minutes
  • Topological Sort • 21 minutes
  • Computing Strong Components: The Algorithm • 29 minutes
  • Computing Strong Components: The Analysis • 26 minutes
  • Structure of the Web [Optional] • 18 minutes

4 readings • Total 40 minutes

  • Week 1 Overview • 10 minutes
  • Overview, Resources, and Policies • 10 minutes
  • Lecture slides • 10 minutes
  • Optional Theory Problems (Week 1) • 10 minutes

2 quizzes • Total 60 minutes

  • Problem Set #1 • 30 minutes
  • Programming Assignment #1 • 30 minutes

Dijkstra's shortest-path algorithm.

4 videos 2 readings 2 quizzes

4 videos • Total 79 minutes

  • Dijkstra's Shortest-Path Algorithm • 20 minutes • Preview module
  • Dijkstra's Algorithm: Examples • 12 minutes
  • Correctness of Dijkstra's Algorithm • 19 minutes
  • Dijkstra's Algorithm: Implementation and Running Time • 26 minutes

2 readings • Total 20 minutes

  • Week 2 Overview • 10 minutes
  • Optional Theory Problems (Week 2) • 10 minutes
  • Problem Set #2 • 30 minutes
  • Programming Assignment #2 • 30 minutes

Heaps; balanced binary search trees.

9 videos 1 reading 2 quizzes

9 videos • Total 141 minutes

  • Data Structures: Overview • 4 minutes • Preview module
  • Heaps: Operations and Applications • 18 minutes
  • Heaps: Implementation Details [Advanced - Optional] • 20 minutes
  • Balanced Search Trees: Operations and Applications • 10 minutes
  • Binary Search Tree Basics, Part I • 13 minutes
  • Binary Search Tree Basics, Part II • 30 minutes
  • Red-Black Trees • 21 minutes
  • Rotations [Advanced - Optional] • 7 minutes
  • Insertion in a Red-Black Tree [Advanced] • 14 minutes

1 reading • Total 10 minutes

  • Week 3 Overview • 10 minutes
  • Problem Set #3 • 30 minutes
  • Programming Assignment #3 • 30 minutes

Hashing; bloom filters.

9 videos 3 readings 3 quizzes

9 videos • Total 171 minutes

  • Hash Tables: Operations and Applications • 19 minutes • Preview module
  • Hash Tables: Implementation Details, Part I • 18 minutes
  • Hash Tables: Implementation Details, Part II • 22 minutes
  • Pathological Data Sets and Universal Hashing Motivation • 21 minutes
  • Universal Hashing: Definition and Example [Advanced - Optional] • 25 minutes
  • Universal Hashing: Analysis of Chaining [Advanced - Optional] • 18 minutes
  • Hash Table Performance with Open Addressing [Advanced - Optional] • 15 minutes
  • Bloom Filters: The Basics • 15 minutes
  • Bloom Filters: Heuristic Analysis • 13 minutes

3 readings • Total 30 minutes

  • Week 4 Overview • 10 minutes
  • Optional Theory Problems (Week 4) • 10 minutes
  • Info and FAQ for final exam • 10 minutes

3 quizzes • Total 90 minutes

  • Problem Set #4 • 30 minutes
  • Programming Assignment #4 • 30 minutes
  • Final Exam • 30 minutes

Instructor ratings

We asked all learners to give feedback on our instructors based on the quality of their teaching style.

data structures for assignment

The Leland Stanford Junior University, commonly referred to as Stanford University or Stanford, is an American private research university located in Stanford, California on an 8,180-acre (3,310 ha) campus near Palo Alto, California, United States.

Recommended if you're interested in Algorithms

data structures for assignment

Stanford University

Shortest Paths Revisited, NP-Complete Problems and What To Do About Them

Greedy algorithms, minimum spanning trees, and dynamic programming.

data structures for assignment

Divide and Conquer, Sorting and Searching, and Randomized Algorithms

data structures for assignment

Specialization

Why people choose Coursera for their career

data structures for assignment

Learner reviews

Showing 3 of 1956

1,956 reviews

Reviewed on Dec 28, 2019

I am very confident in the skills I learned. I have read some books on algorithms but this course makes the application so clear regardless of your programing language.

Reviewed on Apr 9, 2020

The best algorithms course available. More on the theoretical side which in my opinion is more important, if theory is understood, implementation becomes second nature.

Reviewed on Aug 2, 2021

Very well put together course. Problem sets are tricky; however, by the end of the course, you feel quite accomplished and comfortable dealing with graphs and basic data structures.

New to Algorithms? Start here.

Placeholder

Open new doors with Coursera Plus

Unlimited access to 7,000+ world-class courses, hands-on projects, and job-ready certificate programs - all included in your subscription

Advance your career with an online degree

Earn a degree from world-class universities - 100% online

Join over 3,400 global companies that choose Coursera for Business

Upskill your employees to excel in the digital economy

Frequently asked questions

When will i have access to the lectures and assignments.

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.

The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy Opens in a new tab .

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If fin aid or scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

More questions

Search code, repositories, users, issues, pull requests...

Provide feedback.

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly.

To see all available qualifiers, see our documentation .

  • Notifications

This repository will host all the code the that I create in my Data Structures class. In this class I hope to improve on commenting and general documentation

Ykphill/Assignment-1-OOP-and-Java-Classes

  • Java 100.0%

Human Capital Management Structures

After completing this lesson, you will be able to:

  • Describe the differences of the enterprise, personnel, and organizational structures within HCM

An HCM system enables you to set up three integrated structures, the enterprise structure, the personnel structure and the organizational structure.

data structures for assignment

You must be able to evaluate and report on employee data from all enterprise-specific organizational aspects. Every employee is included in the structure of the company.

Integrated HCM Structure

The three structures in HCM are:

  • Enterprise structure
  • Personnel structure
  • Organizational structure

The allocation of employees to the structures in their enterprise is of the utmost importance in Human Resources. In a Personnel Action, the first step is to enter the data for these three structures in IT0001 Organizational Assignment.

data structures for assignment

The employee is assigned, for example, to a company code, a personnel area, and a personnel sub area. You also assign employees to positions, which results in the employee's assignment to an organizational unit, a job, and a cost center.

Enterprise Structure

The enterprise structure for personnel administration is determined by the client, company code, personnel area, and personnel subarea.

data structures for assignment

An employee’s assignment to the enterprise structure results in the employee being assigned to the following elements:

A client is a self-contained unit within the system. It can either be valid for a company code at the smallest level, or for the entire corporate group. You should consider the following points before you decide to set up a client:

  • There is usually no exchange of data between clients.
  • If an employee changes clients, you must create the personnel number again.

A company code is a self-contained unit in legal terms, for which you can draw up a complete set of accounts. Legally required financial statements, such as balance sheets and profit and loss statements, are created at the company code level. The company code is the highest level of the company structure and is defined in Accounting.

A personnel area is used exclusively in personnel administration. Each personnel area must be assigned to a company code and represents a subdivision of the company code. The individual personnel areas in a company code have 4-digit alphanumeric keys.

A personnel subarea is a further subdivision of the personnel area. The organization of the most important subareas of personnel administration takes place at this level. The principal organizational aspects of human resources, such as the pay scale and wage type structures, and the planning of work schedules are controlled at this level. The personnel subarea is assigned a 4-character alphanumeric key.

Personnel Structure

An employee’s assignment to the personnel structure results in an assignment to structures specific for personnel administration. These assignments include the employee group and employee subgroup. Employees are also assigned to a payroll area. These assignments are also relevant for time management and payroll.

Employee Group

An employee group is a general division of employees. The group defines the relationship between an employee and a company based on how the employee contributes to the company in terms of work. Active employees, pensioners, and early retirees comprise the main employee groups in personnel administration.

data structures for assignment

The following are the principal functions of the employee group:

  • It can determine the default values for the payroll accounting area or basic pay.
  • It is used as a selection criterion for reporting.
  • It is one unit of the authorization check.

Employee Subgroup

An employee subgroup is a division of an employee group according to contract conditions. Wage earners, salaried employees, and non–pay scale employees are examples of subgroups within the active employee group. All control features of the personnel structure are defined at the employee subgroup level.

data structures for assignment

The following features are some of the most important customizing settings for time management and payroll calculation:

  • The employee subgroup for the payroll allows you to define different payroll procedures for different employee subgroups. For example, you can specify whether an employee’s pay should be accounted on an hourly or monthly basis.
  • When entering data, you can define default values using the employee subgroup, for example, for the payroll accounting area.

Payroll Areas

The payroll area represents a group of employees used for running payroll. All employees who have their payroll run for them at the same time and for the same period are assigned to the same payroll area.

data structures for assignment

Payroll accounting is generally performed for each payroll area. The payroll area provides the payroll driver with two pieces of information, the number of employees to be accounted for and the dates of the payroll period. The number of employees to be accounted is determined using IT0001 Organizational Assignment, which stores the payroll area.

Infotype 0001 Organizational Assignment

Assignments to these HCM structures are reflected in the IT0001 Organizational Assignment for the employee. This infotype is one of the mandatory master data infotypes that must always exist for every employee. There can be no gaps or overlaps in the validity period for this infotype, and each record must be unique.

data structures for assignment

Information on this infotype is important for data entry, payroll, and time management. If this master data infotype does not exist, the employee is not set up in the HCM structure, so time management and the payroll component cannot be used.

Watch the video which explains the main areas of an organizational assignment record in HCM, including enterprise structure, personnel structure, and linking employees to organizational plans. 

Update of an Organizational Plan

Organizational Management is based on the concept that every element of a company constitutes a unique object with individual attributes. You create and maintain each object individually, and you create relationships between the various objects to form a framework for your organizational plan. This gives you a flexible basis for personnel planning, previewing, and reporting.

data structures for assignment

Organizational units, jobs, positions, cost centers, and persons are examples of objects that you use as part of Organizational Management.

Once you have created a structure using objects and relationships, you can assign additional characteristics to the objects.

data structures for assignment

Objects consist of three parts:

  • Object Infotype – includes the ID number, a short and long text, and the validity period.
  • Relationship Infotype – contains the relationship(s) between this and other objects.
  • Other Infotypes – contains the object characteristics, for example Account Assignment.

All data for an object (existence, relationships, and additional characteristics) is created in the form of infotypes. You define particular characteristics for an object in each infotype.

Not all infotypes are absolutely necessary. Some infotypes can be maintained for all object types, for example, the object and relationship infotypes. Others are only relevant for particular object types, such as the vacancy infotype, which is only relevant for positions.

Organizational Plan

An organizational plan is described by the following characteristics:

  • A comprehensive and dynamic model of the structural and personnel environment in your enterprise.
  • Created using organizational units and positions.
  • Depicts the organizational structure of a company.
  • Depicts the individual positions and the reporting structure in the organization.

data structures for assignment

You portray hierarchies within your organizational plan. The organizational structure shows the hierarchy that exists between organizational units. This structure is created by creating and maintaining organizational units and relating them to one another. Your organizational plan maps the line structure and the reporting structure (chain of command) of your enterprise.

An organizational plan can include the following objects:

  • Root and subordinate organizational units.
  • Jobs – templates for positions.
  • Cost centers.

Organizational Units

Organizational units describe the various business units that exist in your company.

data structures for assignment

Organizational units can be classified generally (for example, by function or by region) or specifically (for example, by project group). The way in which organizational units for your company are classified depends on how your company is structured.

You must relate organizational units with one another in an organizational plan. The hierarchical relationships that exist between the organizational units represent the organizational structure of your enterprise. Organizational units can be linked to cost centers from Accounting.

Job Templates

Jobs are general classifications of tasks that are performed by employees. Each job represents a unique classification of responsibilities in your organization. When you create jobs templates, you should consider what specific tasks and requirements are associated with individual positions.

data structures for assignment

The job template is used to define the specific positions, with the positions inheriting the job characteristics.

Jobs can be used in other HR components, for example:

  • Shift Planning
  • Personnel Cost Planning
  • Personnel Development

Positions are individual employee assignments. Positions are held by employees.

data structures for assignment

Multiple positions can be based on one job. After you create a job, you must specify the number of corresponding positions required in the company. A position inherits a job's tasks; however, you can also define additional tasks that refer specifically to one position. Positions can be completely filled, partially filled, or vacant.

As a rule, each position represents one employee. However, a position can be occupied by more than one employee, each working less than full time. For example, one employee can hold 60% and another employee 40% of a position.

Cost Centers

Cost centers can be assigned to organizational units and positions. Cost center assignments are inherited along the organizational structure and are maintained in Controlling. This hierarchical inheritance can be interrupted, when cost centers are assigned to organizational units or positions at lower levels.

data structures for assignment

Persons are objects that hold positions within the organizational structure. Persons generally represent employees in your company.

data structures for assignment

Infotypes for persons are maintained in Personnel Administration and are linked to an organizational plan through their position assignment, for example IT0001 Organizational Assignment, IT0002 Personal Data, IT0007 Planned Working Time.

Log in to track your progress & complete quizzes

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • My Account Login
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Open access
  • Published: 18 January 2024

KARR-seq reveals cellular higher-order RNA structures and RNA–RNA interactions

  • Tong Wu   ORCID: orcid.org/0000-0001-9968-9909 1 , 2   na1 ,
  • Anthony Youzhi Cheng 3 , 4   na1   nAff7 ,
  • Yuexiu Zhang 5 ,
  • Jiayu Xu 5 ,
  • Jinjun Wu   ORCID: orcid.org/0000-0002-2379-1162 6 ,
  • Xiao Li 5 ,
  • Bei Liu 1 , 2 ,
  • Xiaoyang Dou 1 , 2 ,
  • Pingluan Wang 1 , 2 ,
  • Linda Zhang 1 , 2 ,
  • Jingyi Fei   ORCID: orcid.org/0000-0002-9775-3820 6 ,
  • Jianrong Li 5 ,
  • Zhengqing Ouyang   ORCID: orcid.org/0000-0003-2842-8503 4 &
  • Chuan He   ORCID: orcid.org/0000-0003-4319-7424 1 , 2 , 6  

Nature Biotechnology ( 2024 ) Cite this article

1597 Accesses

41 Altmetric

Metrics details

  • RNA metabolism
  • RNA sequencing

RNA fate and function are affected by their structures and interactomes. However, how RNA and RNA-binding proteins (RBPs) assemble into higher-order structures and how RNA molecules may interact with each other to facilitate functions remain largely unknown. Here we present KARR-seq, which uses N 3 -kethoxal labeling and multifunctional chemical crosslinkers to covalently trap and determine RNA–RNA interactions and higher-order RNA structures inside cells, independent of local protein binding to RNA. KARR-seq depicts higher-order RNA structure and detects widespread intermolecular RNA–RNA interactions with high sensitivity and accuracy. Using KARR-seq, we show that translation represses mRNA compaction under native and stress conditions. We determined the higher-order RNA structures of respiratory syncytial virus (RSV) and vesicular stomatitis virus (VSV) and identified RNA–RNA interactions between the viruses and the host RNAs that potentially regulate viral replication.

RNA lies in the center of gene expression regulation primarily through its interactions with other biomacromolecules, such as RNA-binding proteins (RBPs), DNA and other RNA species. Pioneered by approaches involving crosslinking and immunoprecipitation (CLIP) 1 , various methods have been developed to study RNA–RBP interactions, which have markedly advanced RNA biology. However, how RNAs interact with other molecules and the subsequent functional consequences remain inadequately studied. The functions of RNAs are usually determined by their higher-order structures 2 , which include the assembly and organization of multiple RNA secondary structural elements as well as three-dimensional (3D) RNA conformations mediated by other biomacromolecules. However, these structures are difficult to determine due to their highly dynamic nature, in particular when responding to perturbations.

High-throughput sequencing-based approaches have been applied to reveal RNA–RNA interactions transcriptome wide. Intramolecular RNA–RNA interactions are often critical to the functional relevance of higher-order RNA structures, and intermolecular interactions may reflect distinct RNA functions. Psoralen-based methods, such as PARIS, RAP-RNA, LIGR-seq, SPLASH and COMRADES 3 , 4 , 5 , 6 , 7 , directly capture RNA duplexes and were applied to study the impact of pairwise RNA interactions on RNA metabolism. These methods predominantly capture base-pairing interactions but may miss the distance information between spatially proximal single-stranded RNA fragments. Protein-mediated approaches, including CLASH, RPL, MARIO and RIPPLiT 8 , 9 , 10 , 11 , complement psoralen crosslinking by revealing physical distances between RNAs, but they usually exhibit limited sensitivity for transcripts with modest expression levels and mostly capture RNAs bound by particular proteins. RIC-seq improved the sensitivity by performing protein-mediated proximity ligation in cells and incorporating biotinylated nucleotides during the ligation step 12 . However, because the local protein concentration and the strength of protein–RNA association are spatially heterogeneous 13 , RNA regions with weak protein–RNA engagement could be underrepresented. Therefore, it is desirable to develop technologies that do not rely solely on protein–RNA crosslinking to study higher-order RNA structures and RNA–RNA interactions.

Here we describe kethoxal-assisted RNA–RNA interaction sequencing (KARR-seq), which takes advantage of N 3 -kethoxal-mediated RNA labeling 14 and dendrimer-based nucleic acid conformation capture 15 . N 3 -kethoxal has been demonstrated to efficiently label RNA with azide groups in live cells, serving as a bio-orthogonal handle for further functionalization 14 . Meanwhile, PAMAM dendrimers modified by multiple dibenzocyclooctane (DBCO) moieties can readily crosslink proximal RNA transcripts by reacting with the labeled azide groups. Because KARR-seq uses chemical crosslinkers to capture spatially proximal transcripts, RNA–RNA interactions detected by KARR-seq are not determined solely by local protein concentrations or RBP–RNA affinities. The size of crosslinkers can be tuned to capture proximity at different distances. KARR-seq also enables the enrichment of crosslinked products, increasing the sensitivity for transcripts with relatively low abundance.

KARR-seq accurately reveals RNA tertiary structures and identifies intermolecular RNA–RNA interactions. Using KARR-seq, we show that cytoplasmic mRNA has less compact structures than their nuclear counterparts, with translation resolving higher-order RNA structures under native and stress conditions. We detected RNA–RNA interactions that affect pre-rRNA processing. Furthermore, we mapped the tertiary structures of respiratory syncytial virus (RSV) and vesicular stomatitis virus (VSV) RNAs and detected hundreds of interactions between viral and host RNAs. Host mRNAs that interact with RSV and VSV enrich different molecular pathways. The blockage of RNA–RNA interactions between RSV RNA and host mRNAs represses RSV replication. KARR-seq thus enables precise and sensitive mapping of RNA–RNA interactions and RNA structurome to reveal RNA functions.

Development of KARR-seq

Effective capture of spatially proximal RNAs has been challenging primarily owing to modest RNA crosslinking efficiency by limited available RNA crosslinkers 16 . Click chemistry reactions happen fast and quantitatively in cells under ambient conditions, enabling the detection of binding landscapes of many biomacromolecules 17 . However, these reactions have not been applied in RNA proximity studies due to a lack of ‘clickable’ functional groups on cellular RNA. In KARR-seq, we applied N 3 -kethoxal, a cell-permeable and nucleus-permeable small molecule that efficiently functionalizes RNA with an azide tag 14 , 18 . We also decorated commercially available dendrimers with multiple DBCO and biotin moieties (Supplementary Fig. 1a,b ), with DBCO reacting with proximal N 3 -kethoxal-modified RNA via ‘click’ reactions and biotin enabling enrichment of the crosslinked products.

We first labeled fixed and permeabilized murine embryonic stem cells (mESCs) with N 3 -kethoxal and then diffuse modified dendrimer G3 at 37 °C to initiate the ‘click’ reaction (Fig. 1a ). Gel electrophoresis (Supplementary Fig. 1c ) and dot blot (Supplementary Fig. 1d ) of the purified RNA confirmed successful RNA crosslinking. Control experiments performed in the absence of N 3 -kethoxal or dendrimer showed very weak or invisible signals in dot blot (Supplementary Fig. 1d ). RNA from crosslinked cells was then fragmented and applied for pull-down using streptavidin-coated beads. On-bead end repair and proximity ligation were subsequently performed, and the post-ligation RNA was amplified for pair-ended sequencing for roughly 100 million reads per sample (Fig. 1a ).

figure 1

a , KARR-seq workflow. Cells are first treated with N 3 -kethoxal to modify RNAs with azide tags (red), which enables crosslinking of the tagged RNA molecules by DBCO-decorated dendrimers (blue). Biotin modifications (pink) on dendrimers facilitate the enrichment of crosslinking products, followed by proximity ligation, RNA library construction and sequencing. Chimeric sequencing reads are aligned to identify RNA–RNA interactions. b , Physical distances between interacting fragments of TERC in K562 cells, measured by KARR-seq data generated using G1 and G7 dendrimers, respectively. The physical distances were measured using the cryo-EM structure of TERC . The actual physical distance distribution in the cryo-EM structure is shown in blue for comparison. c , Illustration of loop and stripe structures detected by KARR-seq. In arc groups, loops, left stripes and right stripes are denoted in blue, yellow and pink, respectively. Corresponding KARR-seq chimeric reads are displayed below. d , The KARR-seq interaction maps and arc groups for the Eef1g ( EEF1G ) transcript in mESCs (left) and HepG2 cells (right). e , The simulated physical distance map of the human EEF1G transcript. For b – d , KARR-seq was performed in two biological replicates.

De-duplicated KARR-seq chimeric reads constituted around 6% of all reads and recapitulate ribosomal RNA (rRNA) higher-order structures (Supplementary Fig. 1e ). In negative controls performed in the absence of N 3 -kethoxal, chimeric reads constitute only 0.26% of sequencing reads. RNA contact maps generated from individual KARR-seq replicates show high correlations (Supplementary Fig. 1f ). Because N 3 -kethoxal specifically reacts with guanines 14 , we evaluated the effect of nucleotide content on the frequency of KARR-seq chimeric reads. Using data produced by G1 dendrimers, we analyzed transcripts with more than 250 chimeric reads and grouped all base positions according to their chimeric reads coverage. We found that guanine is modestly enriched among bases with high chimeric reads coverage (Supplementary Fig. 1g ). We then performed KARR-seq with a 1:1 mixture of human (K562) and Drosophila (S2) cells. In this case, chimeric reads constitute 8.4% of total reads when mapped to the reference genomes. Interspecies chimeric reads account for 7.2% of chimeric reads (Supplementary Fig. 2a ) and 0.61% of all sequencing reads, with no significant interaction detected between human and Drosophila RNAs (Supplementary Fig. 2b ). The percentage of interspecies chimeric reads for KARR-seq (0.61%) is similar to that in RIC-seq (0.6%) 12 . Note that, in RIC-seq, the proximity ligation reaction was performed in fixed cells instead of free solutions, and, therefore, the interspecies ligation frequency is expected to be low.

We next analyzed KARR-seq data produced by dendrimers with different diameters, namely G1 (22 Å), G3 (36 Å), G5 (54 Å) and G7 (81 Å), in mESCs and K562 cells. We projected KARR-seq chimeric read positions onto the 3D cryogenic electron microscopy (cryo-EM) structures of TERC and U1 and calculated the spatial distances between the two RNA fragments from each chimeric read. For both transcripts, G7 captures a larger median distance than G1 does (Fig. 1b and Supplementary Fig. 3a ). Dendrimers with similar sizes detect similar transcriptome-wide RNA–RNA interaction landscapes (Supplementary Fig. 3b,c ). The choice of dendrimers did not affect ligation efficiency (Supplementary Fig. 3d ), but G1 and G3 captured twice the amount of transcripts with valid interactions as G7 did (Supplementary Fig. 3e ), potentially because G1 and G3 are more accessible to compact ribonucleoprotein (RNP) complexes 19 . We, therefore, used G1 for KARR-seq experiments unless otherwise noted.

KARR-seq maps higher-order RNA structures

KARR-seq chimeric reads reveal both intra-molecular and inter-molecular RNA–RNA interactions (Supplementary Fig. 3c ). We first evaluated the behavior of KARR-seq in depicting higher-order RNA structures in HepG2 cells, K562 cells and mESCs. We plotted the KARR-seq interaction map of human 18S rRNA from K562 cells and found that KARR-seq contact frequency recapitulates main features of the 18S rRNA physical distances map revealed by cryo-EM 20 (Supplementary Fig. 4a ). The distribution of physical distance revealed by KARR-seq overlaps decently with the actual distribution revealed by cryo-EM (Supplementary Fig. 4b ). In comparison, RIC-seq and PARIS enrich interactions within short physical distances (Supplementary Fig. 4b ), suggesting that KARR-seq may capture RNA proximity in a broader spatial distance range.

Compared to rRNA, mapping mRNA tertiary structures is particularly challenging owing to their dynamic nature and the relatively low abundance of mRNAs. KARR-seq detects mRNA loops and stripes, where loops stand for relatively stable duplex interactions and stripes represent more dynamic contacts or RNA–RNA proximity without direct pairing (Fig. 1c and Methods ). We simulated RNA tertiary structures based on the freely jointed chain (FJC) model. Benchmarking with the cryo-EM structure using RPPH1 resulted in a Pearson’s correlation coefficient of 0.696 between the actual and simulated physical distance maps (Supplementary Fig. 4c ), indicating decent accuracy of the simulation. KARR-seq interaction frequency maps of mRNAs match the corresponding simulated physical distance maps (Fig. 1d,e ), demonstrating the capability of KARR-seq in mRNA tertiary structure depiction. Interaction maps for homologous transcripts in HepG2 cells and mESCs reveal similar topologies (Fig. 1d ), suggesting conserved RNA tertiary structure in different species.

Benchmarking KARR-seq with RIC-seq and PARIS

We systematically compared KARR-seq data (HEK293T, K562 and HepG2 cells) and results from PARIS (HEK293T cells) 4 and RIC-seq (HeLa cells) 12 ; PARIS and RIC-seq represent the state-of-the-art methods in mapping RNA duplexes and protein-mediated RNA proximity, respectively. KARR-seq exhibits a stronger correlation with RIC-seq than with PARIS (Fig. 2a ), because both KARR-seq and RIC-seq detect spatially proximal RNA, whereas psoralen primarily reacts with duplexes. We next analyzed the minimal free energy (MFE) of the intramolecular RNA–RNA interactions detected by KARR-seq, PARIS and RIC-seq on TERC and U1 and divided all interactions into three categories depending on whether they match with cryo-EM structures. Interactions detected by PARIS mostly correspond to known secondary structures and tend to have low mean MFE, whereas interactions detected by KARR-seq and RIC-seq include more spatially proximal non-duplex contacts (denoted as tertiary; Fig. 2b ) and interactions that are not revealed by cryo-EM (denoted as novel; Fig. 2b ). All three methods share a similar ratio of chimeric reads (5–7%) when data were mapped to the human genome (Supplementary Fig. 5a ). However, the percentage of chimeric reads dropped to only around 1% when RIC-seq data were mapped to the transcriptome (Supplementary Fig. 5a ), indicating that RIC-seq chimeric reads enrich pre-mRNA. Transcripts detected by RIC-seq exhibit lower expression levels (Supplementary Fig. 5b ) and are longer (Supplementary Fig. 5c ) than those detected by KARR-seq and PARIS. Concordantly, KARR-seq and PARIS share a similar RNA–RNA interaction landscape between different RNA categories, whereas the majority (56%) of interactions identified by RIC-seq are intron mediated (Fig. 2d ), further suggesting that RIC-seq enriches interactions in the cell nucleus.

figure 2

a , Average Pearson correlation between the interaction maps of KARR-seq (K562 and HEK293T cells), RIC-seq (HeLa cells) and PARIS (HEK293T cells). b , MFE for RNA–RNA interactions detected using KARR-seq, PARIS and RIC-seq within TERC and U1 transcripts, respectively. Interactions were grouped into ‘secondary’, ‘tertiary’ and ‘novel’. ‘Secondary’ refers to the interactions that match secondary structure prediction. ‘Tertiary’ refers to spatially proximal RNA regions revealed by the cryo-EM structure that do not correspond to secondary structures. ‘Novel’ refers to interactions that are not supported by secondary structures or cryo-EM structures. c , Circos plots showing the RNA–RNA interaction landscape revealed by KARR-seq, PARIS and RIC-seq. The width of the link between two RNA categories indicates the relative abundance of chimeric reads taken by interactions between these two categories. d , Left, the physical distance map of TERC revealed by the cryo-EM structure of TERC . Right, higher-order structures of TERC detected by KARR-seq, PARIS and RIC-seq under the same sequencing depth. The blue dots denote base-pairing secondary structures acquired from the Rfam annotations (RF00024). e , The ROC–AUC curves for KARR-seq, RIC-seq and PARIS for detecting higher-order structures of TERC , 18S, 28S and U3. The dashed lines denote random classifiers. RIC-seq and PARIS data were acquired from the Gene Expression Omnibus (RIC-seq: GSE127188 ; PARIS: GSE74353 ). Cryo-EM structures were acquired from the Protein Data Bank (accession codes: 7QXB for TERC , 6QX9 for U3 and 4V6X for 18S and 28S). KARR-seq was performed in two biological replicates.

We then performed differential analysis of RNA–RNA interactions detected by KARR-seq and RIC-seq. Interactions uniquely detected by KARR-seq are mostly stripes, whereas RIC-seq-specific interactions are mostly loops (Supplementary Fig. 5d ), suggesting that KARR-seq could detect more transient and dynamic contacts. Around half of KARR-seq-specific intramolecular interactions on mRNAs are located at coding sequences (CDS), whereas 84% of RIC-seq-specific intramolecular interactions on mRNAs are at the 3′ untranslated region (UTR) (Supplementary Fig. 5e ), which could be due to the binding preference of specific RBPs 21 .

We next benchmarked KARR-seq, PARIS and RIC-seq using transcripts with published cryo-EM structures, including 18S, 28S, U3 and TERC . KARR-seq detects pervasive intramolecular RNA–RNA interactions and recapitulates the physical distance maps (Fig. 2d and Supplementary Fig. 6a ). Quantitatively, receiver operating characteristic (ROC) analysis revealed higher or similar area under the curve (AUC) numbers of KARR-seq than the other two methods (Fig. 2e ). Note that KARR-seq, PARIS and RIC-seq datasets were generated from different cell lines and were sequenced to different depths, which could complicate direct comparisons.

Because psoralen crosslinks double-stranded RNA, PARIS data show segment-like patterns that represent RNA duplexes on the contact maps. In comparison, kethoxal predominantly reacts with single-stranded regions. Therefore, KARR-seq data present triangular domain-like structures (Fig. 2d and Supplementary Fig. 6b,c ). Notably, ‘KARR-seq domains’ cover similar nucleotide regions as ‘PARIS segments’ do (Fig. 2d and Supplementary Fig. 6b,c ), suggesting that PARIS and KARR-seq reveal different possible RNA conformations at the same RNA region to complement each other.

Distinct mRNA higher-order structures in the cell nucleus

RNA secondary structures are dynamic when RNAs transit from the nucleus to cytoplasm 22 . However, higher-order RNA structures among different cellular compartments have not been characterized. We purified K562 nuclei (Supplementary Fig. 7a ) for KARR-seq, with dot blot showing a comparable RNA-labeling efficiency in purified nuclei and intact K562 cells (Supplementary Fig. 7b ). KARR-seq data from the nuclear fraction differ evidently from that using intact cells (Supplementary Fig. 7c ). Transcripts with valid higher-order structures detected in the nucleus are longer and show lower expression levels than those detected in intact cells (Supplementary Fig. 7d,e ). Notably, transcripts detected by KARR-seq using cell nuclei and transcripts detected by RIC-seq using intact cells display similar length and expression level (Supplementary Fig. 7d,e ). These results suggest distinct higher-order RNA structure landscapes between the nucleus and cytoplasm and corroborate that RIC-seq enriches nuclear RNA–RNA interactions.

We further investigated the higher-order structure differences between nuclear and cytoplasmic RNA in K562 cells. We also performed KARR-seq in vitro using purified and refolded K562 total RNA, which reveals the intrinsic ability of RNA polynucleotide chain to fold in the absence of cellular factors. RNA contact frequency decreases log-linearly to the coordinate distance in all tested conditions. The slope of the trend lines, defined as beta coefficients ( β ), is −2.06 and −1.63 for mRNA in intact cells and the nuclei, respectively, with more long-range contacts detected in the nuclear fraction (Supplementary Fig. 8a ). We calculated the beta coefficient for each individual transcript. RNAs within the nuclei tend to have higher beta coefficient compared to same transcripts from intact cells, shown as a transcriptome-wide distribution (Supplementary Fig. 8b ) or scrutinized at the level of individual transcripts (Supplementary Fig. 8c–e ).

In the KARR-seq protocol, cells are pre-fixed using 1% formaldehyde before N 3 -kethoxal treatment. We assayed the effect of formaldehyde crosslinking by performing KARR-seq using weakly fixed (0.1% formaldehyde) and unfixed K562 cells. Under 1% and 0.1% formaldehyde conditions, RNA–RNA interactions detected by KARR-seq are largely similar both at the transcript level (Supplementary Fig. 9a,b ) and transcriptome wide (Supplementary Fig. 9c,d ). However, interaction maps generated from the unfixed cells resemble those for refolded RNA (Supplementary Fig. 9a,b ). Quantitatively, we observed higher beta coefficients across the transcriptome along with more long-range interactions using unfixed cells (Supplementary Fig. 9c,d ). We speculate that weakly bound RBPs and ribosomes could fall off from RNA during the labeling steps in the absence of formaldehyde, which could lead to partial RNA refolding. Formaldehyde crosslinking is likely necessary to capture bona fide cellular RNA conformations using KARR-seq.

To quantify the extent of RNA folding across transcripts with varying lengths and abundance, we devised the folding index, an exponential decay transformation of the genomic distance between the two arms of each chimeric read 23 ( Methods ). The folding index describes the relative genomic distance of two interacting RNA fragments, with a high folding index reflecting RNA–RNA interaction and/or proximity detected between two distant RNA fragments, or an RNA is more extensively folded (Supplementary Fig. 10a,b ). To calculate the folding index of a given transcript (or region), we computed the mean value of folding indexes for all chimeric reads mapped to this transcript (or region) as the transcript-level (or region-level) folding index. For transcriptome-wide comparisons, we plotted the distribution of folding index for all chimeric reads or the distribution of all transcript-level folding indexes. Sequencing depth differences and changes in the abundance of certain transcripts show minimal effects on the comparison of folding index between different conditions (Supplementary Fig. 10c–e ). The median folding index for nuclear mRNA is 0.51, which is largely higher than that for the total cellular mRNA (median: 0.37; Supplementary Fig. 10f ), confirming more extensive folding of nuclear RNA.

Certain RBPs are associated with RNA–RNA interactions

RNA secondary structures have been demonstrated to drive RBP binding 24 , but the relationship between higher-order RNA structures and RBPs has yet to be fully explored. We first examined the association between RBP and RNA–RNA interactions by measuring RBP density at interaction regions using large-scale eCLIP data from ENCODE 25 . A larger number of RBP eCLIP peaks was observed on RNA–RNA interaction regions (identified using refolded RNA) than on shuffled regions in HepG2 and K562 cells (Supplementary Fig. 11a,b ). We next quantified the association between individual RBP and RNA–RNA interactions by applying a multiple linear regression to correlate eCLIP reads density of each RBP to region-level folding index values throughout the K562 transcriptome. A small set of RBPs, including LIN28B, SRSF1, FXR1, FXR2, FMR1, SND1, METAP2, BUD13, ZNF622, UPF1 and YBX3, was identified to be positively correlated with RNA–RNA interactions (Supplementary Fig. 11c ). The low correlation coefficients suggest a weak association, indicating that each RBP binds only to a small portion of RNA–RNA interaction regions.

To validate this observation, we performed KARR-seq in K562 cells after YBX3 or SRSF1 knockdown (Supplementary Fig. 11d,e ). Knockdown of YBX3 or SRSF1 did not lead to obvious higher-order structure variations on their target RNAs (Supplementary Fig. 11f,g ) nor transcriptome-wide changes of the folding index (Supplementary Fig. 11h ). Therefore, each RBP seems to be associated with or regulate only a small number of RNA–RNA interactions, potentially due to their binding preferences to specific sequence and secondary structure motifs.

Translation suppresses mRNA long-range interactions

We next reasoned that ribosome translocation during translation could remodel higher-order RNA structures by resolving RNA duplexes. To test this hypothesis, we calculated the folding index difference between in vitro and in vivo conditions for each transcript. We observed a positive correlation between the folding index difference and ribosome occupancy density, suggesting that higher translation efficiency could lead to a larger difference in RNA folding between in vitro and cellular conditions (Fig. 3a ). We then performed KARR-seq after treating HepG2 cells with translation inhibitors, harringtonine and cycloheximide. KARR-seq arc groups and interaction maps showed more intramolecular contacts upon translation inhibition (Fig. 3b and Supplementary Fig. 12a ). Most harringtonine-induced and cycloheximide-induced interactions are located at CDS of mRNA (Fig. 3c ), the region where ribosome translocation occurs during translation.

figure 3

a , The effect of ribosome binding on RNA–RNA interactions in HepG2 cells. The x axis denotes ribosome binding strength, and the y axis shows the folding index difference between in vitro and in vivo. b , KARR-seq arc groups for the NCL transcript in control and harringtonine-treated HepG2 cells. Folding index: 0.246 for control and 0.290 for harringtonine. c , Metagene plot showing the relative abundance of intermolecular mRNA interactions under denoted conditions. CHX, cycloheximide; HT, harringtonine. d , The transcriptome-wide distribution of beta coefficients under denoted conditions. *** indicates P  < 0.001. e , Folding index for mRNA and lncRNA in control and harringtonine-treated HepG2 cells. f , The length of transcripts that exhibit upregulated and downregulated intramolecular interactions after arsenite treatment in K562 cells. g , The 5′ UTR, CDS and 3′ UTR length for mRNAs that exhibit upregulated and downregulated intramolecular interactions after arsenite treatment in K562 cells. h , The translation efficiency under the normal condition for mRNAs that exhibit upregulated and downregulated intramolecular interactions after arsenite treatment in K562 cells. In f – h , for the analysis of all transcripts, n  = 104 transcripts for the downregulated group and n  = 73 transcripts for the upregulated group. For the analysis of mRNAs, n  = 102 transcripts for the downregulated group and n  = 68 transcripts for the upregulated group. i , mRNA folding index in control, arsenite-treated and harringtonine-treated K562 cells and purified K562 nuclei. n refers to the number of chimeric read level folding index. n  = 440,484 for whole cell control, n  = 242,268 for whole cell arsenite, n  = 251,601 for whole cell HT, n  = 154,797 for nuclear control and n  = 162,507 for nuclear arsenite. j , mRNA folding index for SG-localized transcripts and other (non-SG) transcripts in control and arsenite-treated K562 cells. n  = 161 transcripts in the non-SG group and n  = 215 transcripts in the SG group. For f – h , P values were calculated by the one-sided Mann–Whitney test. For e , i , j , P values were calculated by the two-sided Mann–Whitney test. In box plots shown in e – h , the lower and the upper bounds denote 25th and 75th percentiles, respectively. The minima denote the lower bound −1.5× IQR. The maxima denote the upper bound +1.5× IQR. KARR-seq was performed in two biological replicates. IQR, interquartile range.

Consistently, inhibitor treatments right-shifted the distribution of RNA beta coefficients (Fig. 3d ) and increased the averaged mRNA folding index (Fig. 3e ). Meanwhile, translation inhibition did not alter the folding index for long non-coding RNAs (lncRNAs) (Fig. 3e ), in line with the absence of translation machinery acting on these transcripts. These results collectively suggest that translation could resolve mRNA intramolecular interactions, leading to more stretched mRNA conformations. To validate this effect, we employed fluorescence in situ hybridization (FISH) imaging to measure the spatial distance between mRNA 5′ ends and 3′ ends under control and translation inhibition conditions. The 5′–3′ distances were around 100 nm for all three tested transcripts (Supplementary Fig. 12c ), consistent with findings from previous single-molecule studies 26 . Translation inhibition resulted in shorter 5′–3′ distances in all three cases (Supplementary Fig. 12c ), confirming that translation could indeed contribute to less extensive mRNA folding.

mRNA was proposed to form a ‘closed-loop’ structure mediated by translation initiation and termination complexes 27 . However, recent studies suggested a revised model of mRNA 5′–3′ communication 26 , 28 , 29 . Using KARR-seq data, we detected modest amounts of chimeric reads between mRNA 5′ and 3′ ends (Supplementary Fig. 12d ). Intriguingly, 5′–3′ interactions were not more prevalent than cis -mRNA interactions at other regions (Supplementary Fig. 12e ), and translation inhibition increased the proportion of 5′–3′ chimeric reads among all chimeric reads (Supplementary Fig. 12d ). Consistent with recent reports 26 , 28 , our data suggest that mRNA loops are not completely closed, and translation inhibition may facilitate closer proximity between mRNA ends. Because mRNA loops are proposed to be mediated by large protein complexes, it is also possible that the sizes of dendrimers are not sufficient to capture all these interactions.

Remodeling of higher-order RNA structures under arsenite stress

Extracellular stresses could remodel RNA homeostasis and induce stress granules (SGs) 30 ; nonetheless, how stresses affect higher-order RNA structures and how RNA interactomes contribute to SG assembly remain to be investigated 31 . We conducted KARR-seq using K562 cells treated by sodium arsenite that induces oxidative stresses and SG formation. We performed differential RNA–RNA interaction analysis between normal and arsenite conditions (Supplementary Data 1 ). After arsenite treatment, transcripts bearing upregulated intramolecular interactions are much longer than those bearing fewer intramolecular interactions (Fig. 3f ). mRNAs bearing upregulated intramolecular interactions after arsenite treatment possess longer 3′ UTR, CDS and 5′ UTR (Fig. 3g ). These mRNAs also show a higher translation efficiency under the normal condition (Fig. 3h ). Collectively, these results suggest the importance of RNA length and mRNA translation efficiency in remodeling higher-order RNA structurome under stress conditions.

Transcriptome-wide analysis revealed that arsenite treatment increased the mRNA folding index to a level seen in harringtonine-treated cells (Fig. 3i ), indicative of more long-range RNA–RNA interactions and more extensive mRNA folding. Because arsenite is known to induce translation repression, this observation corroborates the suppressive effect of translation to higher-order RNA structures in a real physiological context. Interestingly, arsenite treatment decreased mRNA folding index in the nucleus (Fig. 3i ), which might be attributed to the change of RBP composition in the nucleus or other secondary effects.

We next categorized mRNAs into two groups based on their localizations revealed by published SG transcriptomics data 24 , 32 . SG-localized transcripts show a lower averaged folding index than those that are not localized to SG (referred to as non-SG; Fig. 3j ) under normal and arsenite conditions. Non-SG transcripts demonstrate an increased folding index after arsenite treatment, whereas the folding index of SG-localized transcripts remains unchanged (Fig. 3j ). We envision that less extensive mRNA folding enhances the accessibility of mRNAs to interact with RBPs and other RNA molecules, which is crucial to the assembly of multi-component messenger RNP (mRNP) complexes within SGs. Indeed, the binding targets of SG marker proteins G3BP1 and TIA1 (refs. 33 , 34 , 35 ) showed slightly lower folding indexes under both conditions (Supplementary Fig. 12f ). However, the small difference in folding indexes between G3BP1 targets and other transcripts suggests the existence of additional factors that determine RNA composition within SGs.

KARR-seq identifies functional intermolecular RNA–RNA interactions

We next analyzed the intermolecular chimeric reads to identify RNA–RNA interactions between different RNA molecules. KARR-seq detects intermolecular interaction between various RNA categories in both intact cells and cell nuclei, such as lncRNA–mRNA, snoRNA–mRNA, snoRNA–rRNA and mRNA–mRNA interactions, with mRNA-mediated interactions taking the largest portion (Fig. 4a ). KARR-seq and PARIS data reveal similar intermolecular interaction landscapes (Supplementary Fig. 13a,b ), whereas snRNA, snoRNA and some other non-coding RNA species are depleted in RIC-seq data (Supplementary Fig. 13c ).

figure 4

a , The landscape of intermolecular RNA–RNA interactions revealed by KARR-seq in K562 cells (left) and K562 nucleus (right). The width of the link between two RNA categories denotes the relative abundance of chimeric reads taken by interactions between these two categories. mRNA–rRNA interactions, which are primarily a result of translation, were excluded from the plots. b , Interactions between C/D box snoRNA and 18S in K562 cells. Previously identified interaction sites are shown in pink. Interaction sites identified by KARR-seq are shown in green. c , Snapshots of KARR-seq data revealing SNORD25 –18S and SNORD65 –18S interactions. Regions colored in green denote identified interaction regions. The dashed lines denote previously known 2′ OMe modification sites. d , Scheme showing the organization and processing of human pre-rRNA 5′ ETS. e , Top, KARR-seq reads density for interactions between U3 and 5′ ETS in K562 cells (human) and mESCs (mouse). Bottom, KARR-seq interaction maps showing the higher-order structures of 5′ ETS in the corresponding cell lines. Stem loops are enclosed in black squares. f , Relative pre-rRNA levels in K562 cells treated by ASO that blocks a U3 interaction site at the 5′ ETS. Two sets of primers amplifying A′ and A0-proximal regions were applied for qPCR, respectively. Data are mean ± s.d. P values were calculated by Student’s t -test. n  = 3 biological replicates. KARR-seq was performed in two biological replicates.

To assess the sensitivity of KARR-seq in detecting functional intermolecular interactions, we enumerated interactions between C/D box snoRNA and rRNA in HepG2 cells and overlapped these interactions with previously identified rRNA 2′- O -methylation (2′-OMe) sites from the snoRNA database (snoRNA-LBME-db) 36 . snoRNA–rRNA interactions detected by KARR-seq overlap with 80% (12/15) of known modification sites on 18S (Fig. 4b,c ) and 85% (34/40) on 28S (Supplementary Fig. 13d,e ), suggesting a high sensitivity of KARR-seq. We also identified eight uncharacterized snoRNA–rRNA interactions on 18S (Fig. 4b,c ) and 74 on 28S (Supplementary Fig. 13d,e ). We experimentally validated these snoRNA–18S interactions by applying biotinylated antisense oligos (ASOs) that target regions adjacent to these identified interaction sites to pull down specific 18S rRNA fragments. A random biotinylated ASO was used as a control. The input and pull-down samples were subsequently subjected to reverse transcription–quantitative polymerase chain reaction (RT–qPCR) to amplify the corresponding snoRNAs. In four out of the six tested interactions, we observed enrichment of snoRNAs in the 18S pull-down samples compared to the control pull-down, suggesting the validity of these interactions (Supplementary Fig. 13f ). These interactions may correspond to potential 2′-OMe sites that are not yet documented or suggest snoRNA functions beyond 2′-OMe deposition. The other two interactions, SNORD65 –18S and SNORD12C –18S, could be false positives (Supplementary Fig. 13f ).

KARR-seq detects RNA–RNA interactions that affect pre-rRNA processing

rRNA maturation involves stepwise spacer cleavage from the polycistronic 45S pre-rRNA, which relies on extensive interactions between pre-rRNA and U3 RNA 37 . Several U3 binding sites on the 5′ external transcribed spacer (ETS) have been identified in bacteria and lower eukaryotes biochemically 38 , 39 , 40 , 41 , but the landscape of U3–pre-rRNA interactions within mammalian cells and how these interactions influence the 5′ ETS structure are unclear.

The 5′ ETS of mammalian pre-rRNA harbors three closely located cleavage sites, namely A′, A0 and 1 (Fig. 4d ) 37 . KARR-seq revealed extensive interactions between U3 and the A′-A0 region in both K562 cells and mESCs, whereas minimal interactions were detected at the A0-1 region (Fig. 4e , top). In the meantime, A′-A0 and A0-1 regions show distinct higher-order structure features: A′-A0 forms highly dynamic stripe and domain structures, whereas A0-1 includes an array of stable stems (Fig. 4e , bottom). The strength of U3–rRNA interaction tends to decrease as intramolecular 5′ ETS interactions become pronounced (Fig. 4e ). Therefore, we propose that U3 RNA may regulate 5′ ETS processing by maintaining the correct conformation of A′-A0 through direct U3–pre-rRNA interactions. The A0-1 region is less involved in U3-mediated interactions because its stem structures are relatively stable and less susceptible to conformational changes. ASO ( Methods ) that blocks U3–ETS interactions at the A′-A0 region increased pre-rRNA level in HepG2 cells compared to the control ASO (Fig. 4f ), suggesting the importance of U3–ETS interaction in regulating rRNA biogenesis in mammalian cells.

KARR-seq detects RNA–RNA interactions in virus-infected cells

Many viruses use RNA to store genetic information. These viruses have evolved extensively to regulate their life cycle through RNA structure-based mechanisms and can efficiently harness host cellular machineries 42 , 43 . The higher-order structures of most viral genomes and RNA–RNA interactions between virus and host are largely unexplored. In light of this, we applied KARR-seq to A549 cells infected by human RSVs and VSVs, respectively. RSV is a prominent cause of respiratory tract infection in infants, children, the elderly and immunocompromised individuals 44 , whereas VSV has been used for decades as a model system for negative-sense RNA viruses 45 .

KARR-seq coverage on VSV is roughly five times the coverage on RSV. KARR-seq data revealed three layers of information: (1) higher-order structures of RSV and VSV RNAs; (2) effects of virus infection on the higher-order structures of the host transcriptome; and (3) RNA–RNA interactions between viral and host RNAs. Both RSV and VSV are non-segmented negative-sense RNA viruses and share a similar genome organization. However, one unique feature in the RSV RNA genome is the presence of a G/C-rich G gene. KARR-seq detected higher-order structures of both RSV and VSV RNAs. Interestingly, within the RSV RNA, identified interactions are clustered around the G gene and are mostly short-ranged (<2 Kb) (Fig. 5a , upper panel). In contrast, the VSV RNA includes substantially more long-range stem–loop interactions (Fig. 5a , lower panel).

figure 5

a , Loop and stripe structures across the RSV (top) and VSV (bottom) RNAs in infected A549 cells. b , c , KARR-seq arc groups for the NUCB1 ( b ) and EWSR1 ( c ) transcripts in control and RSV-infected A549 cells. Folding index: 0.415 for NUCB1 after RSV infection, 0.527 for NUCB1 without infection, 0.411 for EWSR1 after RSV infection and 0.532 for EWSR1 without infection. d , RNA folding index in control, RSV-infected and VSV-infected A549 cells. n denotes the number of chimeric read level folding index. n  = 1,772,734 for no infection, n  = 596,451 for RSV and n  = 159,725 for VSV. The lower and the upper bounds denote 25th and 75th percentiles, respectively. The minima denote the lower bound −1.5× IQR. The maxima denote the upper bound +1.5× IQR. P values were calculated by the two-sided Mann–Whitney test. e , f , The number of host RNAs from each RNA category that interact with RSV ( e ) and VSV ( f ) RNAs. g , Fluorescent imaging of GFP-tagged RSV and GFP-tagged VSV after cells were transfected with denoted LNA ASOs. These ASOs target mRNA transcripts at positions that interact with RSV RNA. Scale bar, 100 µm. h , i , The percentage of RSV-positive ( h ) and VSV-positive ( i ) cells quantified by flow cytometry after cells were treated with denoted LNA ASOs. Data are mean ± s.d. n  = 3 biologically replicates. P values were calculated by two-tailed Student’s t -test. IQR, interquartile range.

We next analyzed how RSV and VSV infection could affect the higher-order structures of the host transcriptome. We found that RSV infection resulted in a reduction of intramolecular interactions of the host mRNA in A549 cells (Fig. 5b–d ). This effect was not observed in VSV-infected cells. Because translation could suppress global mRNA compaction, the differences between mRNA intramolecular interactions in RSV-infected and VSV-infected cells are potentially related to a rapid host translation shutdown upon VSV but not RSV infection 46 , 47 .

Both RSV and VSV RNAs interact with host mRNAs and ncRNAs (Fig. 5e,f ). Host microRNAs and small nuclear RNAs (snRNAs) have been demonstrated to bind viral genomes to regulate virus life cycles and host RNA metabolism 7 , 48 . However, interactions between viral RNA and host mRNAs have not been well documented. KARR-seq detected 111 and 664 host mRNA transcripts that interact with the RSV and VSV RNAs, respectively (Supplementary Fig. 14a,b ). Different from the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA that preferentially interacts with the CDS of host mRNAs 49 , both RSV-mediated and VSV-mediated interactions are enriched at the 3′ UTR of host mRNAs (Supplementary Fig. 14c ). The interactions between RSV and VSV RNAs with host RNAs predominantly occurs through N , G and L genes (Supplementary Fig. 14d,e ). Although RSV and VSV infections activate similar pathways in A549 cells (Supplementary Fig. 14f,g ), mRNA transcripts that interact with RSV and VSV RNAs enriched distinct functions. RSV-interacting transcripts are involved in responses to cytokine and regulation of apoptosis (Supplementary Fig. 14h ), whereas VSV-interacting transcripts regulate RNA processing, translation, decay and protein targeting (Supplementary Fig. 14i ).

We next investigated the potential functional relevance of host mRNA–RSV interactions. We focused on mRNAs that are related to cytokine-mediated signaling and apoptosis and designed ASOs that contain locked nucleotides (LNA ASOs) to target these mRNAs in A549 cells, to block specific RNA–RNA interactions. We infected the cells with GFP-tagged RSV or VSV and assayed virus replication by measuring GFP signals in ASO-treated cells. As shown by fluorescent imaging and quantification by flow cytometry, LNA ASOs targeting KANK2 and CD44 mRNA repressed RSV replication by more than 60% (Fig. 5g,h ). In the meantime, these ASOs showed minor effects on VSV replication (Fig. 5g,i ), supporting that the repression effect is RSV specific and likely from the blockage of the corresponding RNA–RNA interactions. These results confirmed the capability of KARR-seq in identifying functional intermolecular RNA–RNA interactions in infectious disease models. Instead of passively binding to the most abundant host transcripts, different viral RNAs interact with host mRNA with diverse functions, suggesting roles of virus-specific RNA–RNA interactions in regulating virus propagation. Future systematic studies are required to reveal the exact molecular mechanisms.

How different RNA molecules interact with each other and assemble into higher-order structures has been a long-standing question. Psoralen-mediated and RBP-mediated approaches have been widely applied with successes 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 ; nonetheless, complementary methods to improve sensitivity and analyze RNAs in all cellular compartments are still in demand. A SHAPE-based bi-functional crosslinker was recently developed to map RNA–RNA interactions, but its optimal application was observed in a locus-specific manner using purified RNA in vitro 50 . Taking advantage of kethoxal-mediated RNA functionalization and multifunctional chemical crosslinkers, we developed KARR-seq to detect higher-order RNA structures and pairwise interactions between various RNA categories. KARR-seq not only captures stable RNA base-pairing contacts but also identifies transient RNA–RNA interactions or proximity that do not correspond to secondary structures. Although we focused on RNA proximity in this work, the dendrimer-mediated crosslinking platform could have other applications, such as studying RNA–chromatin interactions and the mechanisms of RNA-mediated compartment assembly 51 .

We noticed that the median physical distances captured by G1 and G7 do not always match the diameters of the dendrimers. In addition, the KARR-seq contact frequency map and the cryo-EM physical distance map of 18S show evident differences. Although the cryo-EM structures reveal the most thermodynamically stable conformations in vitro, RNAs may fold differently in live cells. Given the transient and dynamic nature of many RNA–RNA interactions within cells, it is plausible that these interactions might be effectively captured only when the sizes of crosslinkers match the spatial distances between two RNA fragments. For large dendrimers such as G7, we attached DBCO to multiple surface amine groups. The distance between two DBCO groups can be smaller than the diameter of the dendrimer, enabling the crosslinking of RNAs at closer proximities.

KARR-seq identifies intermolecular RNA–RNA interactions between various RNA categories, including those responsible for the maintenance of pre-rRNA processing and those that may regulate virus metabolism in the host cells. These examples demonstrate the value of KARR-seq in revealing RNA functions through RNA–RNA interactions. The molecular bases behind these functions require comprehensive mechanistic studies. When coupled with protein or RNA enrichment techniques, KARR-seq can be further tailored to study specific biological processes by enriching rare RNA–RNA interactions mediated by specific RBPs and transcripts.

Cell culture

A549 (American Type Culture Collection (ATCC), CCL-185), HEp-2 (ATCC, CCL-23), Vero CCL81 (ATCC, CRLCCL81), HEK293T (ATCC, CRL11268) and HepG2 (ATCC, HB8065) cells were maintained in DMEM (Gibco, 11995) with 10% (v/v) FBS (Gibco) and penicillin–streptomycin. F123 mESCs (Bing Ren laboratory, University of California, San Diego) were cultured in DMEM (Gibco, 11995) with 10% (v/v) FBS, 1 mM L-glutamine (Gibco), 0.1 mM β-mercaptoethanol (Gibco), 1% (v/v) non-essential amino acid stock (100×, Gibco), 1,000 U ml −1 LIF (Millipore) and penicillin–streptomycin. K562 cells (ATCC, CCL243) were cultured in RPMI 1640 (Gibco 11875) with 10% (v/v) FBS and penicillin–streptomycin. Drosophila S2 cells (Thermo Fisher Scientific, R69007) were maintained in Schneider’s Drosophila media supplemented with 10% (v/v) FBS, 0.1% Pluronic F-68 (Gibco) and penicillin–streptomycin. Drosophila S2 cells were grown at 28 °C. The other cells were grown at 37 °C with 5% CO 2 .

To inhibit translation, cells were treated with 2 µg ml −1 harringtonine (Abcam, ab141941) or 100 µg ml −1 cycloheximide (Abcam, ab120093) for 30 min before being harvested. To induce SGs, K562 cells were treated with 200 µM sodium arsenite (Sigma-Aldrich, S7400) for 1 h before being harvested. siRNAs were transfected using Lipofectamine RNAiMAX (Invitrogen, 13778150). After transfection, cells were cultured for 2–3 d before being harvested.

Virus inoculation

Recombinant human RSV A2 strain expressing GFP (rgRSV) was grown and titered in HEp-2 cells. VSV Indiana strain expressing GFP (rVSV-GFP) was grown and titered in Vero CCL81 cells. Confluent A549 cells in T25 flasks were infected with rgRSV or rVSV-GFP at a multiplicity of infection (MOI) of 0.1. After 1 h of adsorption, the inoculum was removed, and fresh DMEM with 2% FBS was added. Infected cells were inoculated at 37 °C. rVSV-GFP-infected cells were fixed at 24 h after infection, and rgRSV-infected cells were fixed at 48 h after infection for KARR-seq.

Western blot

Protein samples were subjected to SDS-PAGE and transferred to a nitrocellulose membrane (Bio-Rad). The membranes were then blocked in 5% milk in PBS with 0.1% Tween 20 (PBST) and subjected to overnight incubation with primary antibodies in 5% milk in PBST at 4 °C. Membranes were washed five times with PBST and then incubated with secondary antibodies for 1 h at room temperature. Membranes were then washed five times before being applied with enhanced chemiluminescence (ECL) and developed. Antibody information is as follows: rabbit anti-SRSF1 (Bethyl Laboratories, A302-052A, 1:1,000); rabbit anti-histone H3 (Cell Signaling Technology, 9717, 1:1,000); rabbit anti-β-tubulin (Cell Signaling Technology, 2146, 1:1,000); rabbit anti-U1-70K (Abcam, ab83306, 1:1,000); rabbit anti-YBX3 (GeneTex, GTX130052, 1:1,000); mouse anti-GAPDH (Proteintech, HRP-60004, 1:20,000); and anti-rabbit IgG, HRP-linked (Cell Signaling Technology, 7074, 1:2,500).

Purified RNA (1 µl) was loaded onto the Amersham Hybond-N+ membrane (GE Healthcare, RPN119B). Membranes were air dried and crosslinked twice using UV Stratalinker 2400 at 150 mJ cm − 2 . Membranes were then blocked overnight in 5% fatty-acid-free BSA in PBST. Membranes were then washed and incubated in streptavidin-HRP (Thermo Fisher Scientific, S-911) in PBST supplemented with 3% fatty-acid-free BSA. The membrane was washed five times before being applied with ECL and developed.

Synthesizing DBCO and biotin-modified dendrimers

PAMAM dendrimer G1 (1.53 µmol; Sigma-Aldrich, 412384), DBCO-NHS (3.06 µmol; Sigma-Aldrich, 761524), biotin-NHS (1.53 µmol; Sigma-Aldrich, 203112) and triethylamine (5 µl; Sigma-Aldrich, 471283) were added to 2 ml of methanol. The reaction mixture was stirred overnight at room temperature before 100 µl of acetic anhydride (Sigma-Aldrich, 320102) and 100 µl of triethylamine were added. The reaction mixture was stirred for another 24 h at room temperature. The dendrimer solution was purified and concentrated with Microsep Advance Centrifugal Devices with Omega Membrane 1K (Pall Corporation, MCP001C41). DBCO-modified and biotin-modified G3, G5 and G7 were synthesized following a similar procedure, with four equivalent DBCO added for G3, 16 equivalent DBCO added for G5 and 64 equivalent DBCO added for G7.

Characterization and quantification of modified dendrimer were performed by measuring the characteristic UV absorbance of the DBCO moiety at 295 nm. A series of DBCO-NHS solutions with known concentrations were prepared as external standards to generate a calibration curve.

A detailed KARR-seq protocol is included as Supplementary Protocol. In brief, cells were crosslinked in 1% formaldehyde for 10 min. For each KARR-seq sample, 2–5 million cells were resuspended into 500 µl of permeabilization buffer (10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 0.2% IGEPAL CA-630, 5 mM EDTA) supplemented with 2 mM N 3 -kethoxal, proteinase inhibitor and RNase inhibitor. Rotate the mixture at room temperature for 30 min. The cells were washed once with permeabilization buffer and then resuspended in 500 µl of permeabilization buffer supplemented with 12.5 µM dendrimers. The click reaction was performed at 37 °C for 1 h with shaking. To isolate RNA, wash and resuspended cell pellets in 410 µl of 25 mM K 3 BO 3 , 50 µl of 10% SDS, 30 µl of proteinase K (Thermo Fisher Scientific, 25530049) and 10 µl of SUPERNase inhibitor (Thermo Fisher Scientific, AM2696). The mixture was shaken at 55 °C for 2 h and subjected to phenol–chloroform extraction and ethanol precipitation.

Dissolve precipitated RNA in 105 µl of 25 mM K 3 BO 3 , 12 µl of 10× DNase I buffer (Thermo Fisher Scientific, AM8170G), 2 µl of DNase I (Thermo Fisher Scientific, 18047019) and 1 µl of SUPERNase inhibitor. The mixture was gently shaken at 37 °C for 30 min. Then, 130 µl of 2× proteinase K buffer (100 mM Tris-HCl, pH 7.5, 200 mM NaCl, 2 mM EDTA, 1% SDS) and 10 µl of proteinase K were then added. The mixture was shaken at 55 °C for another 30 min, followed by phenol–chloroform extraction and ethanol precipitation. Precipitated RNA was dissolved in a mixture of 63 µl of 25 mM K 3 BO 3 and 7 µl of RNA fragmentation buffer (Thermo Fisher Scientific, AM8740). The mixture was heated at 70 °C for 15 min before the reaction was quenched.

For each sample, block 30 µl of Dynabeads Myone Streptavidin C1 (Thermo Fisher Scientific, 65001) at room temperature with 100 µl of 1× binding/wash buffer (5 mM Tris-HCl, pH 7.4, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween 20) containing 1 µg µl −1 BSA (New England Biolabs (NEB), B9000S) and 1 µg µl −1 salmon sperm DNA (Thermo Fisher Scientific, 15632011) for 30 min. Beads were washed and resuspended in 80 µl of 2× binding/wash buffer (10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 2 M NaCl, 0.1% Tween 20) and incubated with the fragmented RNA at room temperature for 20 min with rotation. Beads were washed twice with 100 µl of 1× binding/wash buffer and once with 100 µl of 1× PNK buffer (diluted from 10×; NEB, M0201L).

Resuspend the beads in 41 µl of 25 mM K 3 BO 3 . Add 5 µl of 10× T4 PNK buffer, 3 µl of T4 PNK (NEB, M0201L) and 1 µl of SUPERNase inhibitor. Shake the beads at 37 °C for 30 min. Then, add 1 µl of 10× T4 PNK buffer, 3 µl of T4 PNK and 6 µl of 10 mM ATP. Shake the tube at 37 °C for another 30 min. The beads were then washed twice with 1× binding/wash buffer and once with 1× ligation buffer (diluted from 10×; NEB, M0437M). Resuspend beads in 673 µl of 25 mM K 3 BO 3 , add 100 µl of 10× T4 RNA ligase buffer, 2 µl of 10 mM ATP, 200 µl of 50% PEG 8000, 20 µl of T4 RNA ligase I (NEB, M0437M) and 5 µl of SUPERNase inhibitor. Shake the reaction mixture at 16 °C overnight.

After ligation, wash beads three times with 1× binding/wash buffer. Elute RNA by heating the beads in 50 µl of water at 95 °C for 10 min. Purify the RNA using the RNA Clean & Concentrator Kit (Zymo Research, R1014). Then, 10 ng RNA was used for library construction using the SMARTer Stranded Total RNA-Seq Kit v2–Pico Input Mammalian (Takara, 634413). Libraries were sequenced on the Illumina NovaSeq platform, PE150 mode, with around 100 million reads per sample.

Cell nuclei isolation

Ten million K562 cells were collected and washed once with 1 ml of ice-cold PBS with 1 mM EDTA. Then, 200 μl of ice-cold lysis buffer (10 mM Tris-HCl, pH 7.5, 0.1% NP40, 150 mM NaCl) was added to washed cell pellet, followed by incubation on ice for 5 min. Next, 500 μl of chilled sucrose cushion (24% sucrose in lysis buffer) was gently added below the lysate. The tube was centrifuged at 4 °C at 15,000 g for 10 min, and the supernatant was discarded. Then, 200 μl of ice-cold PBS with 1 mM EDTA was gently added to the nuclei pellet without dislodging the pellet and was then aspirated. Around 5 million nuclei were used for each KARR-seq experiment. Nuclei were crosslinked as described above.

KARR-seq data processing

SeqPrep was first used to merge overlapping read pairs into single-end reads. Reads that failed to merge were excluded from the analysis. The resulting FASTQ files were mapped to both the genome and transcriptome (mm10, hg19 or dm3) using STAR with PCR duplications removed 52 . STAR were used to recover gapped and chimeric reads (--runMode alignReads --outFilterMultimapNmax 100 --outSAMattributes all --alignIntronMin 1 --scoreGapNoncan -4 --scoreGapATAC -4 --chimSegmentMin 15 --chimJunctionOverhangMin 15 --limitOutSJcollapsed 10000000 --limitIObufferSize 1500000000). Gapped reads were defined as reads containing at least one N entry in the CIGAR string. For alignments done using the reference genome, an extra filtering step was performed to remove gapped reads that overlap known splicing junctions. Gapped reads were combined with chimeric reads from the <sample>_Chimeric.out.sam output file to form the final chimeric reads set. Processed results were stored and indexed in the pairix format for efficient range-based querying. Identical processing steps described above were applied for RIC-seq data. For PARIS data, readCollapse.pl was performed to collapse reads with unique molecular identifier (UMI) barcodes. KARR-seq data processing statistics are included in Supplementary Table 1 .

To identify differential chimeric groups between conditions, we used DESeq2 (ref. 53 ) and applied median normalization and Wald’s test. Differential chimeric groups were reported as significant if the adjusted P  < 0.05.

Comparison to physical distance maps

We used the 18S physical distance map to calibrate and infer the physical distance proximity. We calculated physical distances of 18S at a granularity of 5-nucleotide (nt) resolution by calculating the Euclidean distance (L2 normalization) between the centroid positions of each 5-nt window using the cryo-EM structure 20 . To tabulate the inferred physical distance distribution from KARR-seq, PARIS and RIC-seq, we looked up the physical distance value from the 5-nt pairwise physical distance table based on the midpoints from either arm of the 18S chimeric reads. A Gaussian kernel was applied for the kernel density estimate of the physical distance distribution.

For 28S, U3 and TERC , the consistency between physical distance maps and contact maps for each transcript was evaluated using ROC–AUC curves, with physical distance maps binarized using a gamma distribution. The gamma distribution was parameterized as α = 6 and β = µ/α, where the mean physical distance µ = 20 Å.

Chimeric reads clustering and the identification of loops and stripes

To identify loops and stripes, we clustered chimeric reads and categorized them in an unsupervised fashion. We first calculated the pairwise distance using a custom overlapping similarity function:

We then built a graph from the similarity matrix and detected clusters of chimeric reads to define RNA–RNA interactions using the Clauset–Newman–Moore modularity-based method. For each interaction, a minimum of three chimeric reads must be present to support clustering. Representative positions (left, right) of interactions were defined as interaction anchors. Left anchors were defined using the median of all left-arm starting positions, and right anchors were defined using the median of all right-arm ending positions.

Chimeric groups were clustered as stable structures (loops) and dynamic structures (stripes). Loops were defined as interactions with both interacting fragments having fixed positions or limited variability in their positioning. Stripes were defined as interactions with a fixed position on one end and variable interacting positions on the other end. To characterize loops and stripes, we computed the coefficient of variation (CV) for each anchor. Chimeric groups with CV < 1.0 on both anchors were defined as loops. Chimeric groups with CV < 1.0 only on the right or left anchor were defined as right or left stripes, respectively.

Secondary structure prediction by RNAcofold

RNAcofold (ViennaRNA version 2.4.3) was used to make secondary structure predictions and calculate their associated MFE values at interaction anchors. For comparison with the MFE calculated at all interaction anchors, a background set was generated using BEDtools (version 2.26.0) to shuffle the anchor positions while maintaining the same span distribution and proportion of anchors present on the transcript (bedtools shuffle -chrom).

Calculating folding index

The folding distance was defined as the genomic distance between the left arm (start position) of the chimeric read to the right arm (start position) of the chimeric read. To quantify the levels of RNA compaction and to account for differences between transcripts of varying lengths, abundance and experimental conditions, we devised a folding index, a bounded transformation of the span distance. We score the span distance between a bounded interval of 0 to 1, where a value of 0 suggests rigidity and 1 suggests a tendency to compact through folding.

where s is the span distance, and \(\alpha =\frac{1}{200}\) .

This is a monotonically increasing curve that increases linearly and as a function of span distance before converging to 1. It effectively down-weighs excessively long transcriptomic spans. α was parameterized to 1/200 to represent a 150-nt distance at folding index = 0.5. This metric can be applied to a certain transcript, a given region within a transcript or transcriptome wide.

Ribosome profiling data processing

Pre-processing of ribosome profiling sequences was done using Cutadapt (cutadapt –q 30 –a AGATCGGAAGAGCACACGTCT –u 3 –minimum-length 10). The first three bases from the 5′ end were clipped. Sequences were then aligned using BWA with default parameters. Genome coverage files were created using BEDtools (bedtools genomecov –bga) and converted into UCSC bigWig format. Ribosome stalling sites were identified using a model-free approach with CTK 54 (perl tag2peak.pl -big –ss –v –valley-seeking –valley-depth 0.9).

Tertiary structure modeling

We modeled tertiary structures and calculated the expected distance based on the FJC model.

where x is the end-to-end distance between two points.

Average end-to-end distance (E2E) between two points:

where N is the number of monomers/nucleotides between two points, and \(l\) is the fixed length of an RNA nucleotide.

We incorporated secondary structure and MFE into our tertiary structure modeling. Using secondary structure dot-bracket notation, we first applied forgi’s BulgeGraph (ViennaRNA) class to construct the graph. We then constructed the two-dimensional (2D) tree graph, G mst , by finding the minimum spanning tree using Kruskal’s algorithm.

Next, we constructed expected physical distance maps of the tree graph by incorporating the E2E average physical distance function. The physical distance maps were used to restrain the polymer model constructed using the IMP framework 55 . The final polymer chain was output in ( x , y , z ) coordinates, and the simulated physical distance maps were derived using Euclidean distance.

RBP eCLIP analysis

Processed eCLIP BAM files were retrieved from ENCODE. To determine the crosslinking-induced termination sites (CITSs), we summed the counts of the pileups flanking 7 nucleotides upstream and downstream of a given locus X i . For each transformed value T i , we estimated C i as the local Poisson parameter and infer it using the means of C i−100 to C i+100 . P values were then computed for each locus i and fitted to a beta-uniform mixture model to correct for multiple testing. For each locus i, a posterior probability was calculated to determine whether the P value is more likely to follow the beta (significant) or uniform (random) distribution. We used a posterior probability greater than 90% as the threshold for calling significant loci. Consecutive loci greater than the length of 15 were concatenated.

We called CITS on the respective eCLIP controls samples to remove artifacts. We merged the CITS calls by taking only their common intersecting region to yield a stringent eCLIP call set.

Identification of snoRNA–rRNA interactions

We used BWA-mem (bwa mem -M) to map all chimeric segments to identify intermolecular interactions. Chimeric reads were aligned to the canonical transcriptome, with the longest spliced isoforms selected to represent corresponding gene entries. For each C/D box snoRNA, we computed the pile-up coverage of chimeric segments across the transcript length of either 18S or 28S using BEDtools (bedtools genomecov -bg). Regions with systematically high coverages across all snoRNA–rRNA pairs and did not overlap known 2′-OMe sites were deemed as backgrounds. SciPy’s find_peak function was applied to remove background and identify snoRNA–rRNA peaks using the following parameters: width = 50, distance = 50, height = 1 × 10 −4 .

Differential gene expression and gene set enrichment analysis

For RSV-infected and VSV-infected samples, gene counts for mock and infected samples were tabulated using the non-chimeric reads of KARR-seq. Differential gene expression analysis was performed between mock and viral-infected samples using DESeq2 (ref. 53 ). Gene set enrichment analyses (GSEAs) were performed using GSEApy 56 using transcripts with (1) host–viral interactions (with at least 100 host–viral chimeric reads) and (2) differentially expressed transcripts (adjusted P  < 0.01 and log 2 fold change (FC) > 1.5 and log 2 FC < −1.5).

Preparation of FISH probes

FISH probes were designed using Stellaris Probe Designer (version 4.2) and purchased from Integrated DNA Technologies with 5′ amine modifications. For each transcript, we designed 20 probes to target each ends. Pooled probes targeting the 5′ end and the 3′ end were labeled with Alexa Fluor 647 (AF647) NHS Ester (Thermo Fisher Scientific, A20006) and Cy3B NHS ester (Cytiva, PA63101) following the previously published protocol 57 . Labeled probes were purified by ethanol precipitation, followed by running through P6 Micro Bio-Spin columns (Bio-Rad, 7326228). The probe concentration and labeling efficiency were determined using a BioSpectrometer (Eppendorf).

RNA FISH was performed following a published protocol 58 . HepG2 cells were first fixed and permeabilized. Then, 100 µl of 1 nM FISH probes diluted in the hybridization buffer (10% dextran sulphate and 10% formamide in 2× SSC) were applied to cells, and the cells were incubated for 16 h at 37 °C. The cells were then washed twice with FISH wash buffer (10% formamide in 2× SSC) for 30 min before imaging.

Samples were imaged in an imaging buffer containing 50 mM Tris-HCl (pH 8.0), 10% glucose, 2× SSC, 0.5 mg ml −1 glucose oxidase (Sigma-Aldrich) and 67 µg ml −1 catalase (Sigma-Aldrich). Imaging was performed using a Nikon TiE microscope with a CFI HP TIRF objective (×100, NA = 1.49, Nikon) and an EMCCD (Andor, iXon Ultra 888). Signals were acquired using a 647-nm laser (Cobolt MLD) and a 561-nm laser (Coherent Obis).

RNA FISH data analysis

Spot detection was separately performed in the AF647 and Cy3B channels using the ImageJ (1.53n) plugin ThunderSTORM (version 1.3) 59 in 2D. The ThunderSTORM was configured with the following parameters. Image filtering: wavelet filter with a third-order B-spline function and a scaling factor of 2; approximate localization of molecules: local maximum method with 8-neighborhood connectivity and a peak intensity threshold set to three times of the standard deviation of the first wavelet level; sub-pixel localization of molecules: PSF (Integrated Gaussian) method with a maximum likelihood fitting method, a fitting radius of 5 pixels (pixel size = 130 nm) and an initial sigma of 1.6 pixels; and XY uncertainty threshold: 40 nm.

Inter-RNA distances were subsequently calculated using a custom MATLAB (R2021b) code. Chromatic aberration was corrected using TetraSpeck Microspheres (Thermo Fisher Scientific). AF647 and Cy3B spot pairs with a center-to-center distance of less than 300 nm were considered signals from the same mRNA molecule.

Analysis of pre-rRNA level after ASO blockage

Sequences for ASOs and qPCR primers are included in Supplementary Table 2 . ASOs were delivered into K562 cells by electroporation using an SF Cell Line 4D-Nucleofector X Kit (Lonza, V4XC-2024). After 48 h, cells were collected for RNA purification using TRIzol reagent (Thermo Fisher Scientific, 15596026). Purified RNA was then applied to reverse transcription, followed by qPCR.

Validation of snoRNA–18S interactions

Sequences for biotinylated ASOs and qPCR primers are included in Supplementary Table 2 . K562 cells were treated as indicated in the KARR-seq procedure. Total RNA was then isolated from the cells, fragmented and purified as above. Five percent of the fragmented RNA was saved as input. The rest was split into portions and mixed with biotinylated ASOs targeting different rRNA regions in the hybridization buffer (75 mM HEPES, pH 7.0, 150 mM KCl, 25 mM K 3 BO 3 ). The mixture was denatured, followed by a gradual temperature decrease to 25 °C for re-hybridization. The mixture was then subjected to enrichment using 30 µl of Dynabeads Myone Streptavidin C1 (Thermo Fisher Scintific, 65001). Enriched RNA and the corresponding input samples were subjected to RT–qPCR.

Analysis of RSV and VSV replication after ASO blockage

ASOs (Supplementary Table 2 ) were transfected using Lipofectamine RNAiMAX. In brief, A549 cells in 24-well plates were transfected with 80 nM ASO for 24 h, followed by virus infection with rgRSV or rVSV-GFP at the MOI of 0.1. After incubation at 37 °C for 1 h, the virus inoculum was removed, and fresh DMEM with 2% FBS was added. The infected cells were incubated at 37 °C. GFP expression was monitored by fluorescence microscopy and flow cytometry using an Attune NxT Flow Cytometer (Thermo Fisher Scientific). Flow cytometry data were analyzed using FlowJo (10.0.7) software.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Public sequencing data used in this study were acquired from the Gene Expression Omnibus (GEO) with accession numbers as follows: GSE127188 (RIC-seq, HeLa) 12 ; GSE74353 (PARIS, HEK293T) 4 ; GSE138058 (G3BP1-APEX-seq) 32 ; and GSE121952 (ribosome profiling) 60 . Cryo-EM structures were acquired from the Protein Data Bank (accession codes: 7QXB for TERC 61 , 6QX9 for U3 (ref. 62 ) and 4V6X for 18S and 28S 20 ). RBP eCLIP BAM files were accessed from https://www.encodeproject.org/ . Raw and analyzed data for all sequencing experiments have been deposited at the GEO ( https://www.ncbi.nlm.nih.gov/geo/ ) under accession number GSE166155 (ref. 63 ).

Code availability

The source code of KARR-seq is freely available at https://github.com/ouyang-lab/KARR-seq (ref. 64 ).

Hafner, M. et al. CLIP and complementary methods. Nat. Rev. Methods Primers 1 , 20 (2021).

Article   CAS   Google Scholar  

Ganser, L. R., Kelly, M. L., Herschlag, D. & Al-Hashimi, H. M. The roles of structural dynamics in the cellular functions of RNAs. Nat. Rev. Mol. Cell Biol. 20 , 474–489 (2019).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Engreitz, J. M. et al. RNA–RNA interactions enable specific targeting of noncoding RNAs to nascent pre-mRNAs and chromatin sites. Cell 159 , 188–199 (2014).

Lu, Z. et al. RNA duplex map in living cells reveals higher-order transcriptome structure. Cell 165 , 1267–1279 (2016).

Sharma, E., Sterne-Weiler, T., O’Hanlon, D. & Blencowe, B. J. Global mapping of human RNA–RNA interactions. Mol. Cell 62 , 618–626 (2016).

Article   CAS   PubMed   Google Scholar  

Aw, J. G. A. et al. In vivo mapping of eukaryotic RNA interactomes reveals principles of higher-order organization and regulation. Mol. Cell 62 , 603–617 (2016).

Ziv, O. et al. COMRADES determines in vivo RNA structures and interactions. Nat. Methods 15 , 785–788 (2018).

Kudla, G., Granneman, S., Hahn, D., Beggs, J. D. & Tollervey, D. Cross-linking, ligation, and sequencing of hybrids reveals RNA–RNA interactions in yeast. Proc. Natl Acad. Sci. USA 108 , 10010 (2011).

Ramani, V., Qiu, R. & Shendure, J. High-throughput determination of RNA structure by proximity ligation. Nat. Biotechnol. 33 , 980–984 (2015).

Nguyen, T. C. et al. Mapping RNA–RNA interactome and RNA structure in vivo by MARIO. Nat. Commun. 7 , 12023 (2016).

Metkar, M. et al. Higher-order organization principles of pre-translational mRNPs. Mol. Cell 72 , 715–726 (2018).

Cai, Z. et al. RIC-seq for global in situ profiling of RNA–RNA spatial interactions. Nature 582 , 432–437 (2020).

Çetin, B., Song, G. J. & O’Leary, S. E. Heterogeneous dynamics of protein–RNA interactions across transcriptome-derived messenger RNA populations. J. Am. Chem. Soc. 142 , 21249–21253 (2020).

Article   PubMed   PubMed Central   Google Scholar  

Weng, X. et al. Keth-seq for transcriptome-wide RNA structure mapping. Nat. Chem. Biol. 16 , 489–492 (2020).

You, Q. et al. Direct DNA crosslinking with CAP-C uncovers transcription-dependent chromatin organization at high resolution. Nat. Biotechnol. 39 , 225–235 (2021).

Zhang, M. et al. Optimized photochemistry enables efficient analysis of dynamic RNA structuromes and interactomes in genetic and infectious diseases. Nat. Commun. 12 , 2344 (2021).

Mamidyala, S. K. & Finn, M. G. In situ click chemistry: probing the binding landscapes of biological molecules. Chem. Soc. Rev. 39 , 1252–1261 (2010).

Wu, T., Lyu, R., You, Q. & He, C. Kethoxal-assisted single-stranded DNA sequencing captures global transcription dynamics and enhancer activity in situ. Nat. Methods 17 , 515–523 (2020).

Astruc, D., Boisselier, E. & Ornelas, C. Dendrimers designed for functions: from physical, photophysical, and supramolecular properties to applications in sensing, catalysis, molecular electronics, photonics, and nanomedicine. Chem. Rev. 110 , 1857–1959 (2010).

Anger, A. M. et al. Structures of the human and Drosophila 80S ribosome. Nature 497 , 80–85 (2013).

Chakrabarti, A. M., Iosub, I. A., Lee, F. C. Y., Ule, J. & Luscombe, N. M. A computationally-enhanced hiCLIP atlas reveals Staufen1-RNA binding features and links 3′ UTR structure to RNA metabolism. Nucleic Acids Res. 51 , 3573–3589 (2023).

Sun, L. et al. RNA structure maps across mammalian cellular compartments. Nat. Struct. Mol. Biol. 26 , 322–330 (2019).

Ouyang, Z., Zhou, Q. & Wong, W. H. ChIP-Seq of transcription factors predicts absolute and differential gene expression in embryonic stem cells. Proc. Natl Acad. Sci. USA 106 , 21521–21526 (2009).

Sanchez de Groot, N. et al. RNA structure drives interaction with proteins. Nat. Commun. 10 , 3246 (2019).

Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583 , 711–719 (2020).

Adivarahan, S. et al. Spatial organization of single mRNPs at different stages of the gene expression pathway. Mol. Cell 72 , 727–738 (2018).

Hinnebusch, A. G. & Lorsch, J. R. The mechanism of eukaryotic translation initiation: new insights and challenges. Cold Spring Harb. Perspect. Biol. 4 , a011544 (2012).

Khong, A. & Parker, R. mRNP architecture in translating and stress conditions reveals an ordered pathway of mRNP compaction. J. Cell Biol. 217 , 4124–4140 (2018).

Vicens, Q., Kieft, J. S. & Rissland, O. S. Revisiting the closed-loop model and the nature of mRNA 5′–3′ communication. Mol. Cell 72 , 805–812 (2018).

Ivanov, P., Kedersha, N. & Anderson, P. Stress granules and processing bodies in translational control. Cold Spring Harb. Perspect. Biol. 11 , a032813 (2019).

Tauber, D. et al. Modulation of RNA condensation by the DEAD-box protein eIF4A. Cell 180 , 411–426 (2020).

Somasekharan, S. P. et al. G3BP1-linked mRNA partitioning supports selective protein synthesis in response to oxidative stress. Nucleic Acids Res. 48 , 6855–6873 (2020).

Kedersha, N. L., Gupta, M., Li, W., Miller, I. & Anderson, P. RNA-binding proteins Tia-1 and Tiar link the phosphorylation of Eif-2α to the assembly of mammalian stress granules. J. Cell Biol. 147 , 1431–1442 (1999).

Gilks, N. et al. Stress granule assembly is mediated by prion-like aggregation of TIA-1. Mol. Biol. Cell 15 , 5383–5398 (2004).

Kedersha, N. et al. G3BP–Caprin1–USP10 complexes mediate stress granule condensation and associate with 40S subunits. J. Cell Biol. 212 , 845–860 (2016).

Lestrade, L. & Weber, M. J. snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs. Nucleic Acids Res. 34 , D158–D162 (2006).

Henras, A. K., Plisson-Chastang, C., O’Donohue, M.-F., Chakraborty, A. & Gleizes, P.-E. An overview of pre-ribosomal RNA processing in eukaryotes. Wiley Interdiscp. Rev. RNA 6 , 225–242 (2015).

Dragon, F. et al. A large nucleolar U3 ribonucleoprotein required for 18S ribosomal RNA biogenesis. Nature 417 , 967–970 (2002).

Sharma, K. & Tollervey, D. Base pairing between U3 small nucleolar RNA and the 5′ end of 18S rRNA is required for pre-rRNA processing. Mol. Cell. Biol. 19 , 6012–6019 (1999).

Dutca, L. M., Gallagher, J. E. G. & Baserga, S. J. The initial U3 snoRNA:pre-rRNA base pairing interaction required for pre-18S rRNA folding revealed by in vivo chemical probing. Nucleic Acids Res. 39 , 5164–5180 (2011).

Marmier-Gourrier, N., Cléry, A., Schlotter, F., Senty-Ségault, V. & Branlant, C. A second base pair interaction between U3 small nucleolar RNA and the 5′-ETS region is required for early cleavage of the yeast pre-ribosomal RNA. Nucleic Acids Res. 39 , 9731–9745 (2011).

Nicholson, B. L. & White, K. A. Functional long-range RNA–RNA interactions in positive-strand RNA viruses. Nat. Rev. Microbiol. 12 , 493–504 (2014).

Jaafar, Z. A. & Kieft, J. S. Viral RNA structure-based strategies to manipulate translation. Nat. Rev. Microbiol. 17 , 110–123 (2019).

Borchers, A. T., Chang, C., Gershwin, M. E. & Gershwin, L. J. Respiratory syncytial virus—a comprehensive review. Clin. Rev. Allergy Immunol. 45 , 331–379 (2013).

Whelan, S. P. J., Barr, J. N. & Wertz, G. W. Transcription and replication of nonsegmented negative-strand RNA viruses. In Biology of Negative Strand RNA Viruses: The Power of Reverse Genetics (ed. Kawaoka, Y.) 61–119 (Springer, 2004).

Neidermyer, W. J. Jr. & Whelan, S. P. J. Global analysis of polysome-associated mRNA in vesicular stomatitis virus infected cells. PLoS Pathog. 15 , e1007875 (2019).

Groskreutz, D. J., Babor, E. C., Monick, M. M., Varga, S. M. & Hunninghake, G. W. Respiratory syncytial virus limits alpha subunit of eukaryotic translation initiation factor 2 (eIF2α) phosphorylation to maintain translation and viral replication. J. Biol. Chem. 285 , 24023–24031 (2010).

Ziv, O. et al. The short- and long-range RNA–RNA interactome of SARS-CoV-2. Mol. Cell 80 , 1067–1077 (2020).

Yang, S. L. et al. Comprehensive mapping of SARS-CoV-2 interactions in vivo reveals functional virus–host interactions. Nat. Commun. 12 , 5113 (2021).

Christy, T. W. et al. Direct mapping of higher-order RNA interactions by SHAPE-JuMP. Biochemistry 60 , 1971–1982 (2021).

Quinodoz, S. A. et al. RNA promotes the formation of spatial compartments in the nucleus. Cell 184 , 5775–5790 (2021).

Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29 , 15–21 (2013).

Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15 , 550 (2014).

Shah, A., Qian, Y., Weyn-Vanhentenryck, S. M. & Zhang, C. CLIP Tool Kit (CTK): a flexible and robust pipeline to analyze CLIP sequencing data. Bioinformatics 33 , 566–567 (2017).

Russel, D. et al. Putting the pieces together: integrative modeling platform software for structure determination of macromolecular assemblies. PLoS Biol. 10 , e1001244 (2012).

Fang, Z., Liu, X. & Peltz, G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics 39 , btac757 (2023).

Gaspar, I., Wippich, F. & Ephrussi, A. Enzymatic production of single-molecule FISH and RNA capture probes. RNA 23 , 1582–1591 (2017).

Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A. & Tyagi, S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5 , 877–879 (2008).

Ovesný, M., Křížek, P., Borkovec, J., Švindrych, Z. & Hagen, G. M. ThunderSTORM: a comprehensive ImageJ plug-in for PALM and STORM data analysis and super-resolution imaging. Bioinformatics 30 , 2389–2390 (2014).

Huang, H. et al. Histone H3 trimethylation at lysine 36 guides m6A RNA modification co-transcriptionally. Nature 567 , 414–419 (2019).

Sekne, Z., Ghanim, G. E., van Roon, A.-M. M. & Nguyen, T. H. D. Structural basis of human telomerase recruitment by TPP1-POT1. Science 375 , 1173–1176 (2022).

Charenton, C., Wilkinson, M. E. & Nagai, K. Mechanism of 5ʹ splice site transfer for human spliceosome activation. Science 364 , 362–367 (2019).

Wu, T. & Cheng, A. Y. KARR-seq reveals principles of high-order ribonucleoprotein structure assembly. NCBI https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE166155 (2023).

Wu, T. et al. KARR-seq reveals cellular higher-order RNA structures and RNA–RNA interactions. GitHub https://github.com/ouyang-lab/KARR-seq

Download references

Acknowledgements

We thank B. T. Harada, Q. You and P. C. He for critical discussions. We thank B. Graveley for insightful suggestions. This work was supported by the US National Institutes of Health (R01HG012780 and RM1HG008935 to C.H. and R35GM124998 to Z.O.). C.H. is an investigator of the Howard Hughes Medical Institute.

Author information

Anthony Youzhi Cheng

Present address: Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore

These authors contributed equally: Tong Wu, Anthony Youzhi Cheng.

Authors and Affiliations

Department of Chemistry, University of Chicago, Chicago, IL, USA

Tong Wu, Bei Liu, Xiaoyang Dou, Pingluan Wang, Linda Zhang & Chuan He

Howard Hughes Medical Institute, Chicago, IL, USA

Department of Genetics and Genome Sciences and Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA

Department of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA, USA

Anthony Youzhi Cheng & Zhengqing Ouyang

Department of Veterinary Biosciences, College of Veterinary Medicine, The Ohio State University, Columbus, OH, USA

Yuexiu Zhang, Jiayu Xu, Xiao Li & Jianrong Li

Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, University of Chicago, Chicago, IL, USA

Jinjun Wu, Li Wen, Jingyi Fei & Chuan He

You can also search for this author in PubMed   Google Scholar

Contributions

C.H. and T.W. conceived the project. T.W. performed experiments, with help from Y.Z., J.X., J.W., L.W., X.L., B.L., P.W. and L.Z. A.Y.C. analyzed high-throughput sequencing data. X.D. assisted with data analysis. J.F., J.L., Z.O. and C.H. supervised the study. T.W., A.Y.C. and C.H. wrote the paper. All authors discussed the results and edited the paper.

Corresponding authors

Correspondence to Zhengqing Ouyang or Chuan He .

Ethics declarations

Competing interests.

The University of Chicago has filed a patent application on KARR-seq. C.H. is a scientific founder, a member of the scientific advisory board and an equity holder of Aferna Green, Inc. and AccuaDX, Inc., and is a scientific co-founder and equity holder of Accent Therapeutics, Inc. T.W. is an equity holder of AccuaDX, Inc. The other authors declare no competing interests.

Peer review

Peer review information.

Nature Biotechnology thanks Qiangfeng Cliff Zhang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

Supplementary Figs. 1–16 and Protocol.

Reporting Summary

Supplementary tables 1 and 2.

Supplementary Table 1: Sequencing and mapping statistics. Supplementary Table 2: Oligonucleotides used in this work.

Supplementary Data

Differential RNA–RNA interactions between arsenite and control conditions.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Wu, T., Cheng, A.Y., Zhang, Y. et al. KARR-seq reveals cellular higher-order RNA structures and RNA–RNA interactions. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-023-02109-8

Download citation

Received : 23 February 2023

Accepted : 15 December 2023

Published : 18 January 2024

DOI : https://doi.org/10.1038/s41587-023-02109-8

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

data structures for assignment

Technical Articles

Author's profile photo Sefan Linders

Modeling an advanced Hierarchy with Directory in SAP Datasphere

In our previous blog , we covered the basics of creating a Hierarchy with Directory . Now, we’re taking a step further in this blog by enhancing our model with an extra node type , language-dependent texts , and time-dependency , both for the hierarchies and node assignments. These advanced features are often seen in SAP S/4HANA or SAP BW hierarchies but are also applicable to non-SAP hierarchies. We continue to use a simplified data model with local data for clarity, making it easier to grasp these concepts before applying them to more complex, real-world data sources. The additions to the data model are pointed out in Figure 1, with the numbering corresponding to the section numbers in this blog post. The figure shows all views, and each view has a table with data underneath.

This blog is part of a series in which we introduce the Hierarchy with Directory :

  • An Introduction to Hierarchy with Directory
  • Modeling a basic Hierarchy with Directory
  • Modeling an advanced Hierarchy with Directory (this blog)
  • Create SAP S/4HANA GL Account Hierarchy within SAP Datasphere through Community Content
  • Walkthrough of different Enterprise Scenarios via Community Content package – GL Account External Hierarchy with replication Flow
  • More to be added…

Figure%201%3A%20In%20yellow%20all%20extensions%20to%20the%20views%20of%20the%20basic%20model

Figure 1: In yellow all extensions to the views of the basic model

1. Adding another node type

Until now, our node types included Product Category (an inner node type ) and Product ID (the leaf node type). To demonstrate the use of multiple inner node types within a single hierarchy, we extend the model with a Department node type. The idea is that the company that sells these products, has different departments, each responsible for one or more product categories.

To incorporate this new node type, we’ll make the following updates:

  • Expand the hierarchy table by adding a Department ID
  • Insert new entries into the hierarchy table to create a new hierarchy that includes department nodes. We highlight this in Figure 2 with an example, where you can traverse the hierarchy from leaf to root node, starting from the leaf node with Product ID OATMILK_1 . The other hierarchies we leave untouched.
  • Add a new entry to the directory table to denote the new Department hierarchy (see Figure 3).
  • Update the hierarchy view to include the new node type (see Figure 4).
  • Add a few Product IDs and transactions to the data, to make the data preview richer.

Figure%202%3A%20Extending%20the%20hierarchy%20table%20with%20a%20Department%20ID%20column%20and%20new%20hierarchy%20entries

Figure 2: Extending the hierarchy table with a Department ID column and new hierarchy entries

Figure%203%3A%20Adding%20a%20new%20record%20to%20the%20Product%20Hierarchy%20Directory%20table

Figure 3: Adding a new record to the Product Hierarchy Directory table

Figure%204%3A%20Adding%20node%20type%20definition%20for%20Department%20in%20the%20Hierarchy%20view

Figure 4: Adding node type definition for Department in the Hierarchy view

A new data preview shows then the new grouping into departments.

Figure%205%3A%20OATMILK_1%20now%20rolling%20up%20into%20Product%20Categories%2C%20and%20then%20into%20the%20new%20node%20type%20Department

Figure 5: OATMILK_1 now rolling up into Product Categories, and then into the new node type Department

2. Adding dimensions with language-dependent texts to node types

Previously, our inner nodes Product Category ID and Department ID were only defined as just text-based columns . However, these are perfect candidates to be defined as Dimensions . Once associated to the Hierarchy, you can utilize dimension features like text labels , language-dependent texts , and/or time-dependent texts , which are typical in SAP S/4HANA and SAP BW hierarchies.

To illustrate this, we created a Product Category dimension with language-dependent texts. The screenshot below illustrates the required new objects and their association with the Product Hierarchy .

Figure%206%3A%20New%20objects%20for%20Product%20Category%20dimension%20with%20language-dependent%20texts

Figure 6: New objects for Product Category dimension with language-dependent texts

The following steps describe these changes in more detail:

  • Create a Product Category text table with columns for Language and Text .
  • Add sample data to the table. We fill the table for language English (EN) and Dutch (NL) (see Figure 7).
  • Create a Product Category Text view with semantics configured for language and label (see Figure 8).
  • Create Product Category table with key entries.
  • Create a Product Category dimension view, consuming the table, with an association to the text view (see Figure 9).
  • Create an association from the hierarchy view to the new dimension.

Figure%207%3A%20Sample%20data%20for%20Product%20Category%20text%20table

Figure 7: Sample data for Product Category text table

Figure%208%3A%20Semantics%20definition%20for%20the%20Product%20Category%20text%20view

Figure 8: Semantics definition for the Product Category text view

Figure%209%3A%20The%20Product%20Category%20dimension%20view

Figure 9: The Product Category dimension view

When refreshing the data preview of the Analytic Model, you now have the option to display either the ID , the Descriptions , or both (Figure 10), after which the more readable labels show up for the product Category nodes (Figure 11).

Figure%2010%3A%20Choosing%20ID%2C%20Description%20or%20both%20as%20the%20presentation%20mode

Figure 10: Choosing ID, Description or both as the presentation mode

Figure%2011%3A%20Product%20Category%20texts%20are%20now%20displayed

Figure 11: Product Category texts are now displayed

Because we created a dimension with language-dependent texts , we can now switch the data language in the SAP Datasphere setting menu. In below screenshots, you can see that we change the language to Dutch , after which the Product Category nodes show the Dutch translation.

Figure%2012%3A%20Changing%20the%20Data%20Access%20Language

Figure 12: Changing the Data Access Language

Figure%2013%3A%20Product%20Category%20texts%20are%20now%20displayed%20in%20Dutch

Figure 13: Product Category texts are now displayed in Dutch

We repeated the same exercise for Department by creating a new dimension with language-dependent texts and extended the Product dimension by adding language-dependent texts . The result is shown in below screenshot. Please note that to display the texts for leaf nodes, like our Product node, you don’t need to create an association from the hierarchy to the dimension. The existing association from the product dimension to the hierarchy is adequate. For inner nodes this is different, and you will have to create associations from the hierarchy to the dimension.

Figure%2014%3A%20All%20node%20types%20are%20now%20associated%20to%20dimensions%20with%20texts

Figure 14: All node types are now associated to dimensions with texts

Please note that at time of writing, when texts have been added to a dimension, it is not possible anymore to display the technical name of the dimension. When selecting ID as presentation, the Child ID will show up instead. In our data model the Child ID is represented by the field Node ID. How this looks like is illustrated in the screenshot below.

Figure%2015%3A%20Presentation%20mode%20ID%20or%20ID%20and%20Description%20will%20show%20the%20Child%20field%20as%20ID%20and%20not%20the%20technical%20name%20of%20the%20dimension

Figure 15: Presentation mode ID or ID and Description will show the Child field as ID and not the technical name of the dimension

Adding time-dependency to hierarchies, nodes, and texts

Time-dependency is a common requirement in SAP S/4HANA and SAP BW hierarchies, manifesting in several ways:

  • Time-dependency of a hierarchy itself. Determined within the Hierarchy Directory, this aspect decides if and when a hierarchy is available.
  • Time-dependency of node assignments. Defined in the Hierarchy with Directory view, this determines which node is assigned to a parent node within specific time intervals.
  • Time-dependency of dimension texts. Defined in the Text view using date intervals. While this feature was already accessible, it’s now also applicable to a Hierarchy with Directory .

We’ll run through setting up time-dependency for the hierarchy and node assignments.

3. Time-dependency of hierarchies

A hierarchy can be considered active or not, based on an optional time-validity interval. This requires an extension to the previously modeled Product Hierarchy Directory , by adding a Valid From and Valid To field to the Hierarchy Directory table and adjusting the semantics in the Hierarchy Directory view to mark those fields as Business Date From and Business Date To . The below screenshots show the changes we made for the Product Hierarchy Directory table and view.

Figure%2016%3A%20Adding%20time-interval%20fields%20to%20the%20Product%20Hierarchy%20Directory

Figure 16: Adding time-interval fields to the Product Hierarchy Directory

Figure%2017%3A%20Configuring%20the%20semantics%20for%20the%20time-intervals%20in%20the%20Hierarchy%20Directory%20View

Figure 17: Configuring the semantics for the time-intervals in the Hierarchy Directory View

After making these changes, we could simply reopen the Analytic Model to use today’s date as the reference date to select active hierarchies. Instead, we add a Reference Date Variable to the Analytic Model, so that we can choose a reference date ourselves. In the below screenshots you can see how we add this date inside the main design screen of the Analytic Model . Then, we open the data preview again, choose January 1, 2024 as the reference date, and see the two hierarchies active at that date when prompted for a hierarchy selection.

Figure%2018%3A%20Adding%20a%20reference%20date%20variable%20to%20the%20Analytic%20Model

Figure 18: Adding a reference date variable to the Analytic Model

Figure%2019%3A%20Choosing%20the%20reference%20date%20when%20previewing%20data%20in%20the%20Analytic%20Model

Figure 19: Choosing the reference date when previewing data in the Analytic Model

Figure%2020%3A%20The%20hierarchy%20selection%20after%20providing%20a%202024%20reference%20date%2C%20shows%20only%20the%20two%20active%20hierarchies%20at%20that%20point%20in%20time

Figure 20: The hierarchy selection after providing a 2024 reference date, shows only the two active hierarchies at that point in time

4. Time-dependency of node assignments

The validity of node assignments works in a similar fashion as for hierarchies themselves. As depicted in the figure below, we’ve extended the Product Hierarchy table and view with a validity interval and added specific dates to the data in the table. We configured the semantics that define the Valid From and Valid To fields as Business Date From and Business Date To , similarly as we did for the directory in Figure 17.

Figure%2021%3A%20Time-interval%20added%20to%20the%20node%20assignments%20in%20the%20Product%20Hierarchy%20Directory%20view

Figure 21: Time-interval added to the node assignments in the Product Hierarchy Directory view

The data illustrates a notable change: the Sports department, initially aligned with the Non-food department until the end of 2023, shifts its association to the Food department starting in 2024. This transition is visible in the data preview for the year 2024, as shown in Figure 22.

Figure%2022%3A%20Data%20preview%20with%20selection%20date%20January%201%2C%202024%2C%20where%20the%20sports%20departments%20changed%20assignment%20from%20the%20Non-food%20to%20Food%20department

Figure 22: Data preview with selection date January 1, 2024, where the sports departments changed assignment from the Non-food to Food department

5. Time-dependency of dimension texts

As mentioned before, time-dependency of dimension texts is supported as well for hierarchies and is defined in the Text view using date intervals, just like we outlined already for the time-dependency of hierarchies and hierarchy nodes. Applying the semantics works in the same fashion, by choosing a field for Business Date From and Business Date To . From the previous instructions, you should be able to build this yourself. A sample of the data, with both time-dependent as language-dependent texts, is displayed in the figure below.

Figure%2023%3A%20Data%20sample%20of%20both%20language-%20and%20time-dependent%20texts

Figure 23: Data sample of both language- and time-dependent texts

In this blog we extended the basic hierarchy with another inner node type, language-dependent texts, and time-dependency for hierarchies and node assignments. With these features, we presented most of the features of the Hierarchies with Directory . Together, these features are aligned with well-known hierarchies from SAP S/4HANA and SAP BW, which makes it now a lot easier to consume such hierarchies in SAP Datasphere.

Figure%2024%3A%20Final%20hierarchy%20with%20different%20node%20types%20and%20texts

Figure 24: Final hierarchy with different node types and texts

Up next: Create an SAP S/4HANA GL Account Hierarchy within SAP Datasphere through Community Content

Assigned Tags

Figure%203%3A%20Adding%20a%20new%20record%20to%20the%20Product%20Hierarchy%20Directory%20table

Insert/edit link

Enter the destination URL

Or link to existing content

IMAGES

  1. Data Structures Assignment Help by top programmers

    data structures for assignment

  2. Data Structures Tutorial

    data structures for assignment

  3. Data Structure Assignment

    data structures for assignment

  4. Seven (7) Essential Data Structures for a Coding Interview and

    data structures for assignment

  5. Datastructures Assignment Help

    data structures for assignment

  6. In-Depth Understanding of Data Structures in Java

    data structures for assignment

VIDEO

  1. DATA STRUCTURES

  2. Lecture 8: Introduction to Data Structure

  3. DATA STRUCTURES

  4. DATA STRUCTURES

  5. DATA STUCTURES CLASS-01 (Introduction)

  6. # Basics of Data Structure #Data Structure

COMMENTS

  1. Data Structures

    Assignments Exams Staff Syllabus, Spring 2024 Welcome to Data Structures, CS112. After completing the course the student will be able to: Analyze runtime efficiency of algorithms related to data structure design. Select appropriate abstract data types for use in a given application.

  2. Data Structures

    Data Structures This course is part of Data Structures and Algorithms Specialization Taught in English 20 languages available Some content may not be translated Instructors: Neil Rhodes +4 more Enroll for Free Starts Jan 14 Financial aid available 263,677 already enrolled Included with • Learn more About Outcomes Modules Recommendations

  3. PDF Lecture Notes for Data Structures and Algorithms

    Chapter 1 Introduction These lecture notes cover the key ideas involved in designing algorithms. We shall see how they depend on the design of suitable data structures, and how some structures and algorithms are more e cient than others for the same task.

  4. Data Structures Tutorial

    A data structure is a storage that is used to store and organize data. It is a way of arranging data on a computer so that it can be accessed and updated efficiently. A data structure is not only used for organizing the data. It is also used for processing, retrieving, and storing data.

  5. 6.006 Introduction to Algorithms, Lecture 2: Data Structures

    6.006 Introduction to Algorithms, Lecture 2: Data Structures. Resource Type: Lecture Notes. pdf. 217 kB 6.006 Introduction to Algorithms, Lecture 2: Data Structures Download File DOWNLOAD. Course Info ... assignment_turned_in Problem Sets with Solutions. grading Exams with Solutions. notes Lecture Notes. Download Course.

  6. Solve Data Structures

    Prepare Data Structures Data Structures Arrays - DS EasyProblem Solving (Basic)Max Score: 10Success Rate: 93.30% Solve Challenge 2D Array - DS EasyProblem Solving (Basic)Max Score: 15Success Rate: 93.16% Solve Challenge Dynamic Array EasyProblem Solving (Basic)Max Score: 15Success Rate: 86.74% Solve Challenge Left Rotation

  7. Unordered Data Structures

    The Unordered Data Structures course covers the data structures and algorithms needed to implement hash tables, disjoint sets and graphs. These fundamental data structures are useful for unordered data. For example, a hash table provides immediate access to data indexed by an arbitrary key value, that could be a number (such as a memory address ...

  8. Learn Data Structures and Algorithms

    DSA Introduction What is an algorithm? Data Structure and Types Why learn algorithms? Asymptotic Notations Master Theorem Divide and Conquer Algorithm Data Structures (I) Stack Queue Types of Queue Circular Queue Priority Queue Deque Data Structures (II) Linked List Linked List Operations Types of Linked List Hash Table Heap Data Structure

  9. PDF Assignment 9: Data structures

    1 Choosing data structures In the lecture you have learnt that di erent data structures can represent di erent kinds of information and support e cient implementation of di erent operations. To do For each of the use cases below pick a data structure (from the ones you have seen in the lecture) that is best suited for that use case, and explain ...

  10. Data Structures and Algorithms Specialization

    Apply algorithmic techniques (greedy algorithms, binary search, dynamic programming, etc.) and data structures (stacks, queues, trees, graphs, etc.) to solve 100 programming challenges that often appear at interviews at high-tech companies. Get an instant feedback on whether your solution is correct. Apply the newly learned algorithms to solve ...

  11. Learn Data Structures and Algorithms

    4. Practice Problems on Data Structures and Algorithms (DSA) What is Data Structure? A data structure is defined as a particular way of storing and organizing data in our devices to use the data efficiently and effectively. The main idea behind using data structures is to minimize the time and space complexities.

  12. Assignments

    Assignments | Advanced Data Structures | Electrical Engineering and Computer Science | MIT OpenCourseWare Assignments Policies There will be a weekly one-page assignment, 10 assignments in total. You may skip any one problem, or we will ignore the problem with the lowest grade.

  13. Data Structure Types, Classifications and Applications

    Courses Practice What is Data Structure: A data structure is a storage that is used to store and organize data. It is a way of arranging data on a computer so that it can be accessed and updated efficiently. A data structure is not only used for organizing the data. It is also used for processing, retrieving, and storing data.

  14. Assignment 4: Stacking Queues

    The fourth assignment is mostly about stacks and queues. For the former you'll build a simple calculator application, for the latter you'll implement the data structure in a way that satisfies certain performance characteristics (in addition to the usual correctness properties). Problem 1: Calculating Stacks (50%)

  15. 4 Must-Know Python Data Structures

    Data structures are an essential part of any programming language. How you store and manage data is one of the key factors for creating efficient programs. Python has 4 built-in data structures: List; Set; Tuple; Dictionary; They all have different features in terms of storing and accessing data.

  16. Learn Algorithms and Data Structures in Python

    Get started. Algorithms and data structures are important for most programmers to understand. We just released a course on the freeCodeCamp YouTube channel that is a beginner-friendly introduction to common data structures (linked lists, stacks, queues, graphs) and algorithms (search, sorting, recursion, dynamic programming) in Python.

  17. Data Structures

    SAS. Data Structures. Syllabus. Lectures. Assignments. Exams. This assignment WILL TAKE A SUBSTANTIAL AMOUNT OF TIME TO COMPLETE. Imagine (just for the sake of this assignment) that you are a student interested in the Rutgers Computer Science program. There are tons of courses available, many with prerequisites that need to be satisfied before ...

  18. TZhoroev/Coursera-Data_Structures_and_Algorithms

    This repository is a compilation of my solutions to the Data Structures and Algorithms assignments offered by the University of California, San Diego (UCSD) and the National Research University Higher School of Economics (HSE) on Coursera. These assignments, covering material from courses 1 through 6, have all been solved using the Python.

  19. data-structures · GitHub Topics · GitHub

    A data structure is a particular way storing and organizing data in a computer for efficient access and modification. Data structures are designed for a specific purpose. ... This repository consists of the code samples, assignments, and notes for the Java data structures & algorithms + interview preparation bootcamp of WeMakeDevs.

  20. PDF Data Structures Introduction

    Data Structures CS 225 Brad Solomon & G Carl Evans August 21, 2023 Introduction. Learning Objectives ... Lab Assignments 120 10 points each Exams 360 60 points each Final Exam 160 All MPs have a one-day late policy for 93% credit There are no extensions for labs. Grading — Final Grades

  21. CSCI 2720: Data Structures

    Data Structures and Algorithms in Java 2nd Edition by Robert Lafore 2003. This book, though dated, offers a solid foundation in data structures and algorithms using Java. ... Make sure you have access to Odin (this is part of our first homework assignment) If you encounter issues with access to Odin, please contact:

  22. Data Structures Assignments

    Data Structures - Assignments In Emertxe each one of our Assignments will ensure you apply the concept you have leant in your classroom. By solving these assignments, you will go through a systematic problem-solving approach which include requirement understanding, algorithm design, pseudocode creation, dry run and final execution.

  23. Graph Search, Shortest Paths, and Data Structures

    There are 4 modules in this course. The primary topics in this part of the specialization are: data structures (heaps, balanced search trees, hash tables, bloom filters), graph primitives (applications of breadth-first and depth-first search, connectivity, shortest paths), and their applications (ranging from deduplication to social network ...

  24. The Top Student Assistance for Data Structure Assignments

    Data structure assignments are crucial for developing proficiency in programming and problem-solving. By understanding the importance of data structures, familiarizing oneself with common ...

  25. PDF Data Structures C++ Review

    Data Structures CS 225 Brad Solomon & G Carl Evans August 23, 2023 C++ Review (Optional) Open Lab This Week ... If you define a destructor, copy, or assignment operator, you should define all three! Implicit default operators are generated otherwise. Memory Management Stack Heap Global. Reference and Dereference

  26. Ykphill/Assignment-1-OOP-and-Java-Classes

    This repository will host all the code the that I create in my Data Structures class. In this class I hope to improve on commenting and general documentation - GitHub - Ykphill/Assignment-1-OOP-and-Java-Classes: This repository will host all the code the that I create in my Data Structures class. In this class I hope to improve on commenting and general documentation

  27. Human Capital Management Structures

    Human Capital Management Structures. An HCM system enables you to set up three integrated structures, the enterprise structure, the personnel structure and the organizational structure. You must be able to evaluate and report on employee data from all enterprise-specific organizational aspects. Every employee is included in the structure of the ...

  28. KARR-seq reveals cellular higher-order RNA structures and RNA-RNA

    KARR-seq data revealed three layers of information: (1) higher-order structures of RSV and VSV RNAs; (2) effects of virus infection on the higher-order structures of the host transcriptome; and (3 ...

  29. Modeling an advanced Hierarchy with Directory in SAP Datasphere

    Update the hierarchy view to include the new node type (see Figure 4). Add a few Product IDs and transactions to the data, to make the data preview richer. Figure 2: Extending the hierarchy table with a Department ID column and new hierarchy entries. Figure 3: Adding a new record to the Product Hierarchy Directory table.