Estimate the memory bounds required by shared variables in software transactional memory programs

19 trang Gia Huy 17/05/2022 1850

Download

Bạn đang xem tài liệu "Estimate the memory bounds required by shared variables in software transactional memory programs", để tải tài liệu gốc về máy bạn click vào nút DOWNLOAD ở trên

Tài liệu đính kèm:

estimate_the_memory_bounds_required_by_shared_variables_in_s.pdf

Nội dung text: Estimate the memory bounds required by shared variables in software transactional memory programs

VNU Journal of Science: Comp. Science & Com. Eng, Vol. 37, No. 2 (2021) xx-xx Original Article Estimate the memory bounds required by shared variables in software transactional memory programs Ngoc-Khai Nguyen1,2 , Anh-Hoang Truong1,*, Duc-Hanh Dang1 1VNU University of Engineering and Technology, 144 Xuan Thuy, Cau Giay, Hanoi, Vietnam 2Hanoi University of Natural Resources and Environment, 41A Phu Dien, North-Tu Liem, Hanoi, Vietnam Received 01 December 2020 Revised 07 June 2021; Accepted 31 July 2021 Abstract: Estimating memory required by complex programs is a well-known research topic. In this work, we build a type system to statically estimate the memory bounds required by shared variables in software transactional memory (STM) programs. This work extends our previous works with additional language features such as explicitly declared shared variables, introduction of primitive types, and allowing loop body to contain any statement, not required to be well-typed as in our previous works. Also, the new type system has better compositionality compared to available type systems. Keywords: Type system, software transactional memory, memory bound. 1. Introduction * lock errors and the management of locks, so they can focus on the business functions of the Usually, the programmers have to take care programs. of the resource usage of the program. However, this task is more difficult when multi-threaded However, there is a disadvantage of the STM programs are written. Synchronizing threads in mechanism that consumes more memory than multi-threaded programs is often handled by the traditional mechanisms. The results of the lock-based mechanism [1]. However, this experiment in [4] have shown that the STM mechanism is often error-prone with lock errors program can consume more resources than such as deadlock or livelock. conventional programs up to 22 times (3 times on average). This is due to when a thread is As an alternative approach for lock-based spawned, it will duplicate shared variables from mechanism, the STM [2, 3] makes it easier for the parent thread so it can use these variables multi-threaded programming since the independently. These variables will be released programmer does not have to pay attention to the when the thread is synchronized with its parent ___ *Corresponding author. E-mail address: hoangta@vnu.edu.vn 1
2 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx thread. Therefore, estimating the memory used statement. This improvement makes the by shared variables in the STM programs is language much stronger but leads to many necessary to optimize the program and reduce challenges and difficulties for typing; (3) The the risks of being out of memory runtime formula to calculate the resource consumption of exceptions. the program and the auxiliary functions such as Due to the implicit synchronization between onacid, new have also been changed to be threads, and nesting of transactions (detailed in more suitable; (4) The semantic rules of Sections 2.1, 3.2), estimating maximum memory language, environment, and status of the for shared variables is a non-trivial problem. The program are redefined in more details. This current work on this problem often uses dynamic makes their implementation easier in practice. methods [4]. That is, they execute the program This improvement makes the language much and use tools to measure the resources consumed stronger but leads to many challenges and by the program. They do this several times, then difficulties for typing; (3) The formula to take the average or maximum value of the calculate the resource consumption of the measurements. This method is easy to do, but it program and the auxiliary functions such as gives only approximate results. The method that onacid, new have also been changed to be we propose in this work is a static method based more suitable; (4) The semantic rules of on type theory. The results are proven language, environment, and status of the mathematically to ensure correctness, so our program are redefined in more details. This results are reliable. makes their implementation easier in practice. In the previous works, we have developed We prove that the bound is sharp for all several type systems to estimate the resource statements except the conditional statement. usage of the STM programs [5-8], in which we Formally, we adapt the type rules defined in assume that the resources are the parameters our previous works [5, 6] to the new STM provided by the user. Users still have to calculate language in order to make the typing more manually the resources of each transaction, so accurate. Several rules such as T-NEW and T- this work is still generalized, not completely BOOL are combined to make them more general. automated. We adjusted the rules T-COND and T-WHILE to This work is extended from our previous type better conditional and loop expressions. The work [6], in which we add more features to the main contributions of this paper are summarized language to make it less abstract than our as follows: previous works and closer to programming • An improved STM language whose features languages available. We offer some basic are closer to current practical language; features of the STM mechanism in practical • A type system to estimate the memory programming languages. Specifically, we bound required by shared variables in the adjusted some of the syntax and semantics as STM programs and discusses the sharpness follows: (1) Shared variables are explicitly of these boundaries. declared. Expressions and values are more The rest of the article is organized as strictly expressed; (2) Expressions in the loop follows: In the next section, we present an body can contain any statement and are not overview of the STM mechanism with an required to be a well-typed expression.In the illustrative example of a research context, and previous language [6], the loop body expression discuss related works. In Section 3, we describe must be a well-typed one and contain no spawn our STM language. Section 4 is about the type
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 3 system. In Section 6, we discuss the sharpness of may not need to write statements to the type system and typing for the example synchronize between threads, their compiler program. Section 7 provides the conclusion and automatically adds those statements during future works. the execution. This is called implicit synchronization. It is convenient for the 2. Preliminary programmer because they do not care about synchronization between threads. However, To describe the problem more clearly, in this this also creates other constraints, we will section, we explain some of the typical features analyze more in detail in section 3.2. of STM programs and illustrate our context Because of this implicit synchronization, we research with an example. We survey related need to analyze the program at the semantic level works on the STM mechanism and the of the language rather than at the source code estimation of resources used by the programs. level. In addition, these programs have some 2.1. Characteristics of STM-based programs nested transactions, threads run in parallel but are not independent. This makes estimating the The STM mechanism is an alternative to memory consumed by these programs really lock-base mechanism to control concurrency complex. Our solution to this problem is to build programs. It has several typical features as a type system. This is a static program analysis follows: method that has been used in many works [9]. • Complex nested transactions: A transaction The main purpose of this type system is to can be spawned within another open estimate the maximum memory required by transaction. When a transaction is spawned shared variables in STM programs. However, within another transaction, we call the first users can develop them to suit their purpose, for transaction a parent transaction, the latter a example, the calculation of the cost of memory, child transaction. Child transactions must be CPU, network bandwidth resources, or the committed before the parent transaction. number of gases required for smart contracts in Within a thread, if a transaction commits Ethereum, etc. without referring to other threads we call local commit. When a transaction commits, 2.2. An illustrative example the threads spawned within it also have to be To describe the problem, we consider an synchronized. We call this joint commit. STM program segment as shown in Listing 1. • Duplicate shared variables: When a new In this code, the statement onacid is used thread is spawned, it copies the shared to open a transaction; The statement commit is variables of the parent thread into their own used to commit a transaction or joint commit variables for use independently. After the transactions between the parent thread and its transaction is completed, these threads child threads. The statement spawn is used to compare the values of these shared variables spawn a new thread; The variables declared with to synchronize. If there is no conflict, they the keyword shared in front are shared are synchronized, otherwise, they are variables between the parent and child threads. canceled or roll-back. This depends on the Under the STM mechanism, these variables design of the programming language. When are cloned by child threads from their parent using this mechanism, the program does not threads to become their own variables; the need to use locks so it avoids lock errors. remaining statements such as declaring and However, copying shared variables of initializing variables, conditional statement, loop threads causes the program to consume more statement are similar to other common memory. languages. • Implicit synchronization: For some actual programming languages, the programmer
4 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx The behavior of this program is described in • At point ①, the maximum memory the Figure 1. The symbol [ describes the required is 12 units as follows: operations that open a transaction; the symbol ] − Thread 0 requires 4 units in which 2 as the closing of a transaction or joint commit units are for the variable x1 in the first between the parent thread and child threads; transaction, another 2 units for the The dashed square (1) represents the joint variable x2 in the 2nd transaction. commits of the parent and child threads; The − Thread 1 requests 8 units in which 2 point marked by the symbol ♦ represents the units are required because thread 1 branch of the conditional statement: they can clones variable x1 from thread 0 by turn to branch e or branch e ; the dashed square 1 2 STM mechanism, 6 units for variables (2) represents the closing of the transaction y1, y2, and y3. outside of the conditional statement. Visually we realize that the maximum memory the program • At point ②, the maximum memory needs to consume by the STM mechanism can required is 10 units as follows: be at points ①, ②, or ③. − Thread 0 requires 4 units for the variables x1 and x2. Listing 1. Example of a multi-threaded − Thread 1 requires 6 or 5 units as follows: program + 2 units for the variable x1, which is cloned from thread 0 by the STM mechanism. + If the statement if turns to branch p1: 4 units for the variables y4, y5, and y6. In this case, at point ② the program consumes 10 units. + If the statement if turns to branch p2: 3 units for variables y7, y8, and y9. In this case, at point ② the program consumes 9 units. • At point ③, the maximum memory required is 8 units as follows: − Thread 0 requires 4 units for the variables x1 and x2. − Thread 1 requires 6 or 4 units: + 2 units for the variable x1 (it is cloned from thread 0). + If the statement if turns to branch p1: 4 units for the variables y4, y5, and y10. In this case, at point ③, the program consumes 10 units. Assuming that a bool variable needs 1 + If the statement if turns to branch p2: memory unit, an integer variable needs 2 2 units for the variables y7 and y10. memory units, the maximum memory required for shared variables in this program at each time is:
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 5 Figure 1. Describe the behavior of the program in Listing. 1. In this case, at point ③ the program the time, memory, and energy needed by the consumes 8 units. program. In [4], Klein et al. found that the energy According to the above analysis, we realize consumption of the software transactional that the maximum memory required by the memory mechanism was higher than the lock- program is 12 units at point ①. based mechanism of up to 22 times (average more than 3 times). 2.3. Related works In [20], the authors introduce the first STM mechanism: The STM mechanism has automatic analysis for deriving bounds on the been studied for a long time, in which [10] can worst-case evaluation costs of parallel first-order be considered as the first proposal of this functional programs. The analysis is performed approach. In this work, there are limits that by a novel type system for amortized resource transactions are static, the data set is already analysis. The main innovation is a technique that known. In work [13], Herlihy et al. have separates the reasoning about sizes of data overcome the limits on the above static structures and evaluation cost within the same transactions by dynamic transactions in libraries framework. In recent years, for the Ethereum of the Java language. Then there are many blockchain platform, in [21], the authors design studies that continue to be developed and and implement a tool to automatically infer implemented in practice, such as in Java sound gas upper bounds for smart contracts. In language [11-14], Haskell language [15], C++ Ethereum, gas (in Ether, a cryptographic language [16]. The above works showed that the currency like Bitcoin) is the execution fee STM mechanism is very potential and practical. compensating the computing resources of The estimation of resources used by the miners for running smart contracts [22]. In [22], program: Estimating resources used by the the authors analyzed the Solidity language and program is an important issue, and it has been of its compilation. Then they propose a tool for interest to many scientists. In [17], the authors automatically locating gas-costly patterns by present an overview of resource estimation analyzing smart contracts’ bytecodes. issues in software engineering. In [18], the Our work focuses on multi-threaded authors present a method to find the upper languages, using STM mechanisms. We focus bounds for the size of permutation codes via on solving the problems of complex nested linear programming. In [19] Jan Hoffmann et al. transactions, implicit synchronization between propose a method of automatic resource bound threads. Our method is static program analysis, analysis for the OCaml program. He used based on the type system. techniques to parameterize resources to deduce
6 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx In works [7, 23], we have built a type system introduced in Section 1. Specifically, we add the for core STM language, which only contains the S-ASSIGN rule, improve the S-WHILE, S-NEW, S- most basic statements. This type system can TRANS rules, and add some auxiliary functions to calculate the maximum number of transactions make the presentation of the rules more detailed. that an STM program requires in the worst case. We explain these improvements in each rule in These works help us make basic rules and prove more detail. the attributes of the type system conveniently. 3.1. Syntax The type system in our work [5] was developed from the type system in our previous works [7, The syntax of the language is presented in 23], in which we have improved the type system Table 1 At line 1., P represents threads or to calculate the maximum resource that the processes, which can be an empty thread, program needs to use rather than just count the denoted by ∅, or multiple parallel threads P||P, maximum number of transactions as in the or a thread p executing expression e. At line 2., previous works. T represents primitive types, which can be the In [5], each transaction is provided a integer or boolean type. D at line 3. is a list of parameter. These parameters represent the variable declarations, which can be the shared resource that the transaction requires. Our type variables or local variables. O at line 4. system calculates the maximum log memory that represents the operators, which can be arithmetic the program needs to use based on those operations such as +, –, ì, ữ (denoted by •); parameters. relational operations such as =, >, ≥, <, ≤ The results of those works are generalized so (denoted by ■); logical operations such as AND, that users can develop them to suit their OR (denoted by ♦), NOT (denoted ▲). purposes, such as calculating the cost of memory At lines 5., 6., vi, vb are integer and logic resources, CPU, network bandwidth, time, or the values respectively. At line 7., v is values which number of gas required for smart contracts in can be integer values or logical values. At line 8., Ethereum. However, the calculation of the ei is an integer expression or an integer value; consumed resource of each transaction (the At line 9., eb is a boolean expression or a logic parameter of the transaction) is manual, so this is value. still semi-automatic. Then, in [24], we have From line 10. to line 14., an expression e can improved the language and type system to be a statement declaring and initializing new calculate the log memory from the program’s variables, assignment, sequencing (denoted by e; shared variables in a completely automated way. e), branch statements, loop statements, the In [6], our type system is developed from our statement spawn{e} to create a new thread that works [5, 8], in which we expanded the executes e, the statement onacid is for opening language, so that it was closer to the actual a new transaction, the commit statement is for language, and we have also improved the type closing a transaction. system for more convenient calculations. We assume that, in each transaction, The main feature of the type system in this variables are declared at the beginning of each work is the ability to composition. This means transaction, and when the program spawns a that it is possible to type any term of the thread in that transaction, they copy all these program, then we combine them to get the type variables into the new thread. These variables of the program. usually copied in batch and store in a memory 3. Transactional Language area called transaction log, or simply log. The size of the logs will be the total memory of the In this section, we present our STM variables that are cloned or initialized within it. language. This language is developed from [6], where we improve some syntax and semantics 3.2. Dynamic semantics rules so that it is closer to the actual language as
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 7 In this section, we present the semantics of Definition 1 (Local environment). A Local the language, define the environments, and environment E is a finite sequence identifier of auxiliary functions that support the the transactions and their sizes, E = {l1:n1; l2:n2; representation of rules. A thread consists of ; lk:nk}. An environment without any elements many transactions, and each transaction contains is called an empty environment, denoted by 휖 [6]. many variables. We call the environment of a Definition 2 (Global environment). A global thread the local environment [6]. The environment of the program is called the global environment Γ is a set of identifiers of threads environment [6], and which is a collection of and their local environments, Γ = {p1:E1; p2:E2; local environments. Transaction size is the total ; pk:Ek} [6]. memory for the variables inside it. We have the The total memory consumed by the program definition of the local environment as follows. at a time is the total memory of the program’s open transactions at that time. Table 1. The syntax of the transactional language In Table 2, we assume some equivalence Definition 3 (Total memory). The total rules: 푃‖푃′ ≡ 푃′‖푃, 푃‖(푃′‖푃′′) ≡ (푃‖푃′)‖푃′′ memory consumed by the program at a time is and 푃‖0 ≡ 푃, and some auxiliary functions as ⟦ ⟧ ∑푛 ⟦ ⟧ Γ, and 훤 = 푖=1 푖 , where n is the number described below. of threads of the program. • S-TRANS: This rule to start a new A pair of an environment Γ and a collection transaction (execute statement onacid). of threads P are called a state Γ, P of the In this rule, the function onacid(l, p, 훤) program. A special state called error describes creates a transaction $l$ with memory size the fault state - the state at which none of the 0 at the end of the local environment of p. transaction rules can be applied. The dynamic If onacid(l,pi, 훤)= 훤', where 훤 = {p1:E1, semantics rules are specified by transition rules , pi:Ei, , pk:Ek} and with statement of the form 훤, 푃 ⟹ 훤′, 푃′ or 훤, 푃 ⟹ 푒 표 as in fresh l then 훤'={p1:E1, , pi:E'i, , Table 2. pk:Ek} where Ei'=Ei;l:0. • S-COMM: This rule is used to commits a transaction. In this rule, ∐1 푖(푒푖) stands
8 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx for 1(푒1)‖ ‖ (푒 ). If the current Function commit(p, Γ) removes the last transaction identifier of p is l, then all transaction in the local environments of threads with transaction identifier l must all threads in p. Suppose intrans(Γ, l:n) joint commit when transaction l commit. = p and commit(p, Γ)= Γ′, for all p':E' ∈ In this rule, function intrans(훤, l:n) Γ′, if p' ∈ p, then p':(E'; l:n) ∈ Γ. For returns a set of all threads inside this other cases p':E' ∈ Γ. transaction l, denoted by p. In the environment 훤 contains transaction l and • S-NEW: This rule is used to initialize a this transaction is the last element of this new shared variable, where function environment. This means that if intrans(훤, new(x,l,p, 훤) initialize a shared variable l:n) = p = p1, , pk then: x at the end of transaction l (the last transaction at that time). If − For all 𝑖 ∈ {1 }, has the form 푖 푛푒푤( , 푙, 𝑖, 훤)=훤', 훤 ={p1:E1, , pi:Ei, ′; 푙: 푛, 푖 , pk:Ek}, and Ei={l1:n1; ; lj:nj} then − For all ′: ′ ∈ 훤 such that ′ ∉ { 1, , } then we have E' does not contain transaction l. Table 2. The semantics of transactional language ′ Γ = { 1: 1, . . . , 푖: ′푖, . . . , : }, and For variables that are declared outside the E'i={l1:n1; ; lj:n'j}, where n'j=nj+m, m is the transactions and local variables, they do not memory size of the initialized variable. affect the memory resources caused by the
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 9 ′ STM mechanism, so in this work, we do not 푛푗 = 푛푗 + 푠𝑖 푒( ) care about these variables. The function size(x) returns the memory • S-SPAWN: This rule is applied to create a size of the variable x. new thread. The statement spawn{ e1 } creates thread p' for executing e1 in • S-COND: This rule to execute the parallel with thread p (it's parent thread) conditional statement. The expression 𝑖 = and the environment 훤 changes to 훤′. 푒 ↓ 푡 푒 ? 1 ∶ 2 means that if eb is true then i get value 1 else i get value 2. The function 푠 푤푛( , ′, 훤) adds to Γ a new thread p'. Its local environment is • S-WHILE: This rule is used to implement copied from the local environment of the the loop in the program. The expression thread. Suppose 훤 = { : } ∪ 훤′′ and e'= 푒 ↓ 푡 푒? 푒; 푤ℎ𝑖푙푒(푒 ){ 푒 }; 푒2 ∶ 푒2 ′ ′ 푠 푤푛( , , Γ) = Γ′ then Γ = Γ ∪ means that if eb is true then e'=e; ′ { : ′} where E'=E. while(eb){ e }; e2 else e'=e2. • S-ASSIGN: This rule is used to assign a In our previous work, expression e in the value to a variable as usual standard body of the loop has to close w.r.t semantics of programming languages. In transaction and does not contain statement this rule, variable x11 and value of spawn. In this work, the body of the loop expression e1 must be of the same type can be any expression. However, in order T. e0 is the value of x11 before it is to determine the resources consumed by assigned the value e1. The function these loop expressions, we are assuming isclone(x11) returns true if the variable that the maximum number of loops is x11 is cloned from another thread (the known. This is also a drawback in this parent thread of the current thread), and work, and we plan to address it in future returns false if the variable x11 is work. initialized in the current thread. • S-SKIP: This rule is used for other If the variable x11 is initialized in the operations, such as •, ■, ♦, ▲ which do not current thread, then when the program affect the transaction and multi-threaded performs the assignment, it is assigned a semantics, so we simplify here by a skip new value (e1) instead of the old value operation. (e0), and the environment 훤 is not changed. • S-ERROR: This rule is used to handle error cases, e.g., commit in an empty If the variable x11 is the cloned variable environment. from the current thread's parent thread, then when performing the assignment, 4. Type System the function write adds a new variable Our type system aims to estimate the upper adjacent to it to store the new value (e1). bound of the maximum memory required by Thus, its old value (e0) is not shared variables in multi-threaded transaction overwritten. This is to serve the joint programs. Each program segment (called a term) commit between the threads. In this is typed through a special string, thereby case, the environment is changed from 훤 abstracting the transaction behavior of that to 훤′ as follows: program. ′′ ′ ′′ ′ 훤 = 훤 ∪ { 푖: 푖}, 훤 = 훤 ∪ { 푖: 푖 }, The type rules presented here are inherited from our previous work [5, 6], however they are = ′′ ∪ {푙 : 푛 }, ′ = ′′ ∪ {푙 : 푛′}, 푖 푖 푗 푗 푖 푖 푖 푗 improved to match the current language and are described in more detail.
10 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx 4.1. Type been opened but no variable is initialized in that transaction. We use a set of symbols with non-negative − numbers to represent the type of a term. The type • 푛: There are n commit statements in of a term is a finite sequence of numbers with succession to finish the previous symbols, which are marked by a pair of a symbol transactions. and a non-negative natural number in the set ℕ+. ơ We use the set { , +, −, ơ, #} to denote the • 푛: There are n threads need to initialization of a variable, open, commit, joint synchronize at a time. # commit a transaction, and memory maximum • 푛: The maximum memory required for a allocation for the logs. The set of numbers with term is n units of memory. symbols is denoted by ℕ, i.e. ℕ = Definition 4 (Type of a term). The type T of a { 푛, +푛, −푛, ơ푛, #푛| 푛 ϵ ℕ}. The numbers term in our system is defined as follows [6]: assigned to these symbols have the following meaning: = 푆 | ‖T ⊗ T | ρ | ⊘ | ‖ The type of a term can be a sequence of • 푛: Initializes a variable with size n, meaning that it needs to be allocated n tagged numbers S as described in Section 4.1, or units of memory to that variable. synthesized from other types of terms. In this definition, TT means that the type of term is + • 푛: The opening of a transaction, the derived from the type of two sequential terms. memory allocated for that transaction is n. ρ means that a term has type T that will be The case n=0 means that a transaction has Table 3. Typing rules The typing rules are described in Table 3., executed in a thread parallel to its parent thread. where the type of a term is of the form푛 ⊢ 푒: , The T ⊗ T, ⊘ , and ‖ operations are and we read that e has type T. n is the merged, choice, and parallel operations, environment of the expression, and it represents respectively, are to create new terms from the amount of memory consumed or released existing terms, and they will be described in when the program executes e. detail in the next section 4.2. • T-ONACID, T-COMMIT: These two rules 4.2. Typing rules are used to type expressions onacid and
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 11 commit corresponding. The statement then{ e1 } else{ e2 }$ then the type of e is onacid to open a transaction, its type is 1 ⊘ 2. + 0. The statement commit to close the − • T-WHILE: This rule is used to type the last opened transaction, its type is 1. loop expression. The symbol Tm to • T-SPAWN: This rule is used to type the describe a sequence of m components T element spawn{ e }. If the type of e is T consecutively, where m is the maximum then spawn{ e } has the type ρ. Here number of loops, and T is the type of the symbol ρ is used to indicate that the loop body expression. By this rule, our current thread is running parallel with the type system can type any loop expressions parent thread. with a known maximum number of loops. For general loop expressions, we cannot • T-NEW: This rule is used to type the statically determine the number of loops, statements that initialize a new reference because it depends on the values of object. The function size(x) returns the variables during program execution. We size of the variable x (memory allocated plan to investigate loop-bound analysis in for variable x). In this work, we assume developing a type inference algorithm for that an integer variable needs 2 memory our type system in a future work. units, a boolean variable needs 1 memory unit. If e is shared int x=0; then type of e • T-THREAD: If an expression e has type T, then the thread executing it also has type is 2; if e is shared bool x=0; then type of e is 1. T. e here is the expression that will be executed by the thread, so its type must be • T-SEQ: This rule represents sequential a canonical sequence as in Definition 5. components of e1 and e2. When two expressions are sequential, its type is also • T-PAR: This rule is used to type programs at the time the program is running. At this sequential. Suppose e1 has type T1 and e2 time, the program has many parallel has type T2, where T1 has a form without ρ threads running. If we just need to type , e1; e2 has type T1T2. static for the program, we do not need this • T-MERGE: This rule to type expressions rule. However, this rule helps us prove the with two parallel elements. The symbol correctness and sharpness of the type ρ 1 indicates that the thread of type T1 is system. running in parallel with its parent thread. ρ For cases where expressions execute outside To simplify the expression of type 1 2 transactions, or it does not consume memory by we use the ⊗ operation shown in the the STM mechanism, we use the rule T-SKIP. following. This means that we can skip the typing of these When e1 ends with some expressions expressions. The function summem(e) returns ρ the total memory that expression e uses by the spawn, it means that e1 has the type . 1 STM mechanism. When its type is associated with T2, we need to use the aggregate operation to type For convenience, we can add or remove 푡 (푠) the string e1; e2 then it will be of the form elements of 0 form from the string because T1⊗T2. it does not affect the semantics of the string. The ̅ • T-COND: This rule is used to type for set ℕ can be divided into equivalence classes, conditional expressions. We assume that in which all elements in the equivalent class describe the same transaction behavior, and each e1 and e2 have types of T1 and T2, class uses the most concise string to describe the respectively. If the expression e is if(eb) class. We call it the canonical string.
12 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx Definition 5 (Canonical sequence). A sequence 4 1 ⇒ (4 + 1) ⇒ 5. S is canonical if tag(S) does not include elements ′ ′, ′ + +′, ′ − −′, ′##′, ′ + −′, ′ + (2) Inside an open transaction, if we − ′, ′ + # − ′, ′ + # − ′, and ⟦푆(𝑖)⟧ > 0 for initialize one more variable with all i. memory to use is m then the memory needed for that transaction increases We can always reduce an S string without by m. changing the way to understand it. Note that, ′ + + 푛 ⇒ (푛 + ) during string reduction, the pattern ′ + − may not appear on the left, but we can add #0 to apply For example, for the expression e is the function. onacid shared int x=0; Note that, during reducing string we cannot shared int y=0; then the type of reduce the string in +푛, + format, because each + + + e is 0 2 2 ⇒ 2 2 ⇒ 4. + + element 푛, represents an open transaction, − it needs to be committed with an element 1. (3) For nested transactions, the memory that they need to use is the total For convenience of presenting operations in memory of the component the following sections, we introduce several transactions. notations and their meanings: • 푠# represents an empty sequence or an # # # element 푛, i.e. 푠 ∈ {휖} ∪ { 푛|푛 ∈ ℕ}. 퐧 − • 푠 (n stands for negation) represents 1 The formula (1) is used to type ơ 퐧 − ơ + or 푛, i.e. 푠 ∈ { 1} ∪ { 푛|푛 ∈ ℕ }. program segments that behave as 퐜 − described in segment AB, formula (2) • 푠 (c stands for commit) represents 1, ơ 0 0 퐜 − is used to type program segments that 푛, or 1‖ ‖ i.e. 푠 ∈ { 1} ∪ behave as described in segment BC in ơ + 0 0 { 푛|푛 ∈ ℕ } ∪ { 1‖ ‖ | ≥ 2}. the Figure 2. • ⟦푠#⟧ represents the natural number of s corresponding, for example, ⟦5#⟧ is equal to 5. The following are some rules to reduce a sequence of tagged numbers to a canonical sequence. Figure 2. Nested transactions. (4) Two consecutive terms have types (1) In a transaction, memory resources # # 푛, , then their type is the type of allocated to consecutive variables are term that has a greater value, so we equal to the total memory allocated for have the formula to reduce the string as each variable. follows. 푛 ⇒ (푛 + ) # # # 푛 ⇒ ( , 푛) For example, for the expression e is In this case, the behavior of the shared int x=0; shared program segment is similar to that bool y=0; where an integer variable described in the Figure 3. needs 4 units of memory, the logical variable needs 1 unit of memory then the type of e is
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 13 but T2 does not then we do not merge them and wait for the next expression. Figure 3. Two consecutive terms. (7) The ⊘ operation is used to type (5) Symbol ρ is used to mark that the conditional statements. In this case, we expression is executed in parallel to its will choose the element which has a parent thread. larger value. ρ ( ρ )ρ = In case the expression does not contain joint commits, we can ignore them: # ρ # ( 푛) = 푛 ρ For expression of the form 1 2, then T is the remainder of the parent thread 2 In this rule, for brevity, we use the after spawning the child thread symbol ∓푛 instead of the symbol +푛 or executing T1. Since T2 has joint −푛 or ơ푛. commits with T1, we can join T1 with T2 to make it ready for joint commit. (8) The following rules are used for threads ρ to joint commit. The value n in element 1 2 ⇒ 1 ⊗ 2 ơ 푛 represents the number of threads The ⊗ operation is defined rule (6). In inside the latest opened transaction. This this case, the behavior of the program case is described in the Figure 5., and we segment can be described in the Figure can combine them with the following 4. rules. The following rule is similar to the above, but we are interested in nested transactions, the same as described in the Figure 4. Two parallel threads. Figure 6. (6) The ⊗ operation is used to combine the type of threads in parallel. Figure 5. Joint commit parallel threads (case 1). During this operation, we look for joint commits from left to right to merge them. In case T1 contains joint commits Figure 6. Joint commit parallel threads (case 2).
14 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx The above rules are used to type the To explain more clearly these two rules, we program when it is not running (static). consider the example in Listing 2. If your purpose is only to determine the We apply the rules T-ONACID, T-COMMIT, type of program, the above rules are T-INIT, T-SPAWN to type the program. We then sufficient. However, to prove the apply rule (9) to reduce them, and get the type of correctness and sharpness of the type the program is system, we need the following two rules. The two rules (9), (10) are to type the program when the program is running (dynamic), and already have threads After the spawn statement runs, the first running in parallel. thread has an open transaction, and their memory (9) The following rules are used for threads is duplicated. The type of the rest of the two # − # − to joint commit between threads in threads are thread 1: 2 1; thread 2: 2 1. running time. Apply rule T-PAR, and apply rule (9) to Listing 2. Example of applying rules (9), (10) reduce them, we have: This is the type of program at the time after executing the statement spawn. Besides, it has an open transaction and its memory is being duplicated. So the type of this transaction is: + + 1 1. We add this sequence to the above type sequence, and apply rule 10. to reduce them, we have: # So the program type is 6, or the maximum memory the program needs is 6 units. Since # # where k>0 and 1푠1 푠1 , 2푠2푠2 is the type in this work reflects the behavior of canonical forms. a term of a program, so the type of a well- type program is a string containing only one (10) Similar to rule (8)., the rules below apply # at the time that the program is running. element 푛, where n is the maximum memory that the program requires during executing it. Definition 6 (Well-typed). A program is well- typed if it has type T and ⇒∗ 푠# [6]. Definition 7 (Resource consumption). If 훤, 푃 is a running state of the program and
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 15 푃: , then the maximum resource • If 훤, 푃 ⟹ 훤′, 푃′, assume that ⟦훤, 푃⟧ ≤ consumption during executing P is: ⟦푠#⟧, by Lemma 1, we have ⟦훤′, 푃′⟧ ≤ ⟦훤, 푃⟧ ≤ ⟦푠#⟧. ∎ The correctness of the type system is understood that a well-typed program does not use more memory than the amount expressed in its type. 4.3. Characteristic of the type system Theorem 2 (Correctness). Given a well-typed In this section, we present the characteristics ∗ # program P0, its type is T, and ⇒ 푠 then the of type systems and apply them to prove the correctness and sharpness of type systems. resource consumption of the program during # A type of the expression e has the running cannot exceed ⟧ ≤ ⟦푠 ⟧ [6]. characteristic that its environment is only Proof. Let 훤, 푃 be a state of the program, by the sufficient for open transactions to commit in e as Lemma 2, we have ⟦훤, 푃⟧ ≤ ⟦푠#⟧. By Definition described by its type. We have the following ⟦훤, 푃⟧ and Theorem 1 we infer ⟦훤⟧ ≤ ⟦푠#⟧. ∎ theorem. This theorem asserts that, if a program is Theorem 1 (Type judgment property). Assume well-typed then maximum memory usage of the e:T, and its environment is program will not exceed the value expressed in 훤, if 푆 ⇒∗ 푠# then ⟦푠#⟧ ≥ ⟦푆⟧, where 푆 = its type. +푛 +푛 +푛 , and ⟦푆⟧ = ⟦훤⟧ [6]. 1 2 5. Typing the example program Proof (Sketch). By induction on the typing rules In this section, we apply the rules in Table 3. in Table 3. ∎ to build a type inference tree for the example During program execution by the semantic program in List 1. rules in the Table 2, if the program changes from In this work, our type system can type any state 훤, 푃 to 훤′, 푃′ and ⟦훤, 푃⟧ = 푛, then term, and then integrate them into the program's ⟦훤′, 푃′⟧ = 푛′ and 푛 ≥ 푛′ for any rules. We type. However, in order to facilitate the analysis formally express this characteristic in the of sharpness in Section 6.1, we divide it into two following theorem. steps as follows: First, we type the program 10 segment 푒29 (the program segment contains the Lemma 1 (subject reduction). If 훤, 푃 ⟹ 훤′, 푃′ conditional statement). Then we combine it with by R rule and ⟦훤, 푃⟧ = 푛 then ⟦훤′, 푃′⟧ = 푛′ and the rest of the program to get the type of the 푛 ≥ 푛′ for ∀ R [6]. program. Proof (Sketch). The proof is done by checking By applying the rules in Table 3, we can one by one of all the semantics rules in Table 2. build a type inference tree for the program 10 ∎ segment 푒29 as shown in the Figure 7. We inherit this result, and do the same for the Lemma 2 (Preservation). Given a well-typed P0, rest of the program, we get the type inference its type is T and ⇒∗ 푠#. For any state 훤, 푃 of tree of the program as shown in the Figure 8. the program, we have ⟦훤′, 푃′⟧ ≤ ⟦푠#⟧ [6]. Note that the variables a and b at line 1 are local variables, so we omit it. Proof. By inductions on transitions of the Through this type inference tree, we can semantics. conclude that, in the worst case scenario, the • Initial state: ⟦∅, 푃⟧ = ⟦푠#⟧ ≤ ⟦푠#⟧. program can use up to 12 memory units for shared variables.
16 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx 6. Discussion total memory that the program needs to use (i.e. bound is not sharp). In this section, to explain more clearly about During proving the correctness and our work, we discuss sharpness, and evaluations sharpness of the type system, we realized that, of our proposed solution. for expressions that contain conditional 6.1. Sharpness statements, the memory bound found may not guarantee sharpness. To better understand this The sharpness of type system is understood 10 that if given a well-typed program then there is problem, we consider the program segment 푒29 always a path from the initial state 훤 , 푃 to the in the example in List 1. Applying the rules in 0 0 Table 3., we build the type inference tree for the state 훤, 푃 such that the memory consumption of 10 the program at state 훤, 푃 to be equal to the value program segment 푒29 in the Figure 7. shown in their type expression. Through the type inference tree, we realized 10 # In this work, our typing rules ensure that the that the type of 푒29 is 5. This means that it is a memory bound is always greater than or equal to well-typed program segment and the maximum the total memory required by the program (i.e., memory consumed by it is 5 units. ensuring the correctness of the type system). Now, we analyze the program segment However, there are some instances where the through the Figure 1. and code in Listing 1., we bound found by these rules is greater than the have the following cases: Figure 7. Typing the program segment at lines 10-29 in Listing. 1. • If > , the conditional statement will that the program segment needs to use is 4 11 19 10 execute 푒16 , not 푒24, so 푒29 can rewrite units. 11 26 into 푒16 ; 푒29 , and the maximum memory
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 17 • If ≤ , the conditional statement will In this work, our proposed method is a static 19 11 10 execute 푒24, not 푒16 , so 푒29 can rewrite estimation method, based on type theory, which 19 26 can be proved mathematically to ensure into 푒24; 푒29 , and the maximum memory that the program segment needs to use is 2 correctness, so the results are reliable. units. A special feature of our type system is that it We realize that the program segment has no can give type to an uncompleted program, or a path to use up to 5 units (in other words, in this program segment. This feature is very useful, case, the bound is not sharp). because it can give a preview of memory usage Thus, from the above example, we realize patterns while the programmers are typing code. that, if we just based on the expressions in the We tried to find studies close to our work to conditional statement, we can not find the compare results, but we couldn't find any, so memory bound correctly because it depends on comparing our results with other studies has yet the expression before and after conditional to be done. statement. In this work, we use an abstract language Currently, we have done a review of the with the aim to focus on analyzing the behavior cases that we recognize, and we have found that of copying shared variables of the STM the memory bound found by the type system is mechanism. For future work, we plan to apply to sharp for all rules, except for the branching real languages and compare with actual memory statement. bounds. 6.2. Evaluation of our solution Figure 8. Typing the example program in Listing 1.
18 N.N Khai et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 2 (2021) xx-xx 7. Conclusions transactional programs, in: Proceedings of the Eighth International Symposium on Information We present a multi-thread language based on and Communication Technology, SoICT 2017, the STM mechanism and a type system for ACM, New York, NY, USA, 2017, pp. 409–416. estimating the maximum memory for its doi:10.1145/3155133.3155183. programs. In this work, the language and type URL system are more detailed and rigorous than our [7] A. Truong, D. V. Hung, D. Dang, X. Vu, A previous work, so they are closer to reality. type system for counting logs of multi-threaded We added some discussion about the nested transactional programs, in: N. Bjứrner, S. sharpness of the memory bound found by the Prasad, L. Parida (Eds.), Distributed Computing and Internet Technology - 12th International type system, and evaluations of our proposed Conference, ICDCIT 2016, Proceedings, solution. This helps users better understand our Vol.9581 of LNCS, Springer, 2016, pp. 157–168. solution and apply it more effectively. doi:10.1007/978-3-319-28034-9\_21. In our future work, we plan to solve the URL problems of sharpness of the found memory [8] X. Vu, T. M. T. Tran, A. Truong, M. Steffen, A bound and general loop typing. In this work, our type system for finding upper resource bounds of language is still in abstract form to focus on multi-threaded programs with nested transactions, presenting the features of the STM mechanism. in: Symposium on Information and Communication We will apply these results to the actual language Technology 2012, SoICT ’12, Halong City, Quang in future work. Ninh, Viet Nam, August 23-24, 2012, 2012, pp.21– 30. doi:10.1145/2350716.2350722. URL References [9] J. C. Mitchell, Type systems for programming languages, in: Handbook of Theoretical Computer [1] H. Weiwu, S. Weisong, T. Zhimin, L. Ming, A lock- Science, Volume B: Formal Models and Sematics, based cache coherence protocol for scope 1990. consistency, Journal of Computer Science and [10] N. Shavit, D. Touitou, Software transactional Technologydoi: memory, in: Symposium on Principles of Distributed Computing, 1995, pp. 204–213. [2] J. R. Larus, R. Rajwar, Transactional memory, doi:10.1145/224964.224987. Commun. ACM 51 (2006) 80–88. [11] B. Carlstrom, J. Chung, H. Chafi, A. McDonald, [3] T. Harris, J. Larus, R. Rajwar, Transactional Memory, C. Minh, L. Hammond, C. Kozyrakis, K. Olukotun, 2Nd Edition, 2nd Edition, Morgan and Claypool Executing java programs with transactional memory, Publishers, 2010. Science of Computer Programming 63 (2006) 111– [4] F. Klein, A. Baldassin, J. Moreira, P. Centoducatte, S. 129. doi:10.1016/j.scico.2006.05.006. Rigo, R. Azevedo, Stm versus lock-based systems: [12] T. Harris, K. Fraser, Language support for lightweight An energy consumption perspective, in: Proceedings transactions, SIGPLAN Not. 49 (4) (2014) 64–78. of the 16th ACM/IEEE International Symposium on doi:10.1145/2641638.2641654. Low Power Electronics and Design, ISLPED ’10, URL ACM, New York, NY, USA, 2010, pp. 431–436. [13] A. Welc, S. Jagannathan, A. L. Hosking, doi:10.1145/1840845.1840940. Transactional monitors for concurrent objects, in: M. URL Odersky (Ed.), ECOOP 2004 – Object-Oriented [5] A.-H. Truong, N.-K. Nguyen, D. V. Hung, D.-H. Programming, Springer Berlin Heidelberg, Berlin, Dang, Calculating statically maximum log memory Heidelberg, 2004, pp. 518–541. used by multi-threaded transactional programs, in: [14] J. Vitek, S. Jagannathan, A. Welc, A. L. Hosking, A Theoretical Aspects of Computing - ICTAC 2016 - semantic framework for designer transactions, in: 13th International Colloquium, Taipei, Taiwan, D. Schmidt (Ed.), Programming Languages and ROC, 2016, Proceedings, Lecture Notes in Computer Systems, Springer Berlin Heidelberg, Berlin, Science, Springer, 2015, pp. 3–27 (to appear). Heidelberg, 2004, pp. 249–263. [6] N.-K. Nguyen, A.-H. Truong, A compositional type [15] T. Harris, S. Marlow, S. Peyton-Jones, M. Herlihy, systems for finding log memory bounds of Composable memory transactions, in: Proceedings of
P.N. Son et al. / VNU Journal of Science: Comp. Science & Com. Eng., Vol. 37, No. 1 (2020) 1-19 19 the Tenth ACM SIGPLAN Symposium on Principles analysis for parallel programs, in: J. Vitek (Ed.), and Practice of Parallel Programming, PPoPP ’05, Programming Languages and Systems - 24th ACM, New York, NY, USA, 2005, pp. 48–60. European Symposium on Programming, ESOP 2015, doi:10.1145/1065944.1065952. Vol. 9032 of LNCS, Springer, 2015, pp. 132–157. URL doi:10.1007/978-3-662-46669-8-6. [16] B. Dongol, R. Jagadeesan, J. Riely, Transactions in URL relaxed memory architectures, Proc. ACM Program. [21] E. Albert, P. Gordillo, A. Rubio, I. Sergey, Lang. 2 (POPL) (2017) 18:1–18:29. Gastap: A gas analyzer for smart contracts, ArXiv doi:10.1145/3158106. abs/1811.10403. URL [22] T. Chen, X. Li, X. Luo, X. Zhang, Under-optimized [17] L. Briand, I. Wieczorek, Resource Estimation in smart contracts devour your money, 2017 IEEE Software Engineering, 2002. 24th International Conference on Software Analysis, doi:10.1002/0471028959.sof282. Evolution and Reengineering (SANER) (2017) 442– [18] M. Bogaerts, New upper bounds for the size of 446. permutation codes via linear programming, Electr. J. [23] T. A. Hoang, N. N. Khai, A type system for counting Comb. 17. logs of a minimal language with multithreaded and [19] J. Hoffmann, A. Das, S.-C. Weng, Towards nested transactions, Journal of Science of Hnue. automatic resource bound analysis for ocaml, [24] N. N. Khai, T. A. Hoang, A type system for inferring SIGPLAN Not. 52 (1) (2017) 359–373. the log memory of transactional program from shared doi:10.1145/3093333.3009842. variables., Journal of Science and Technology section URL on Information and Communication Technology. [20] J. Hoffmann, Z. Shao, Automatic static cost