The first programming language and freshman year in computer science: characterization and tips for better decision making

The ability to program is the "visible" competency to acquire in an introductory unit in computer science. However, before students can write a program, they need to understand the problem: before formalizing, the student must be able to think, to solve and to define. At an early stage of learning there are no significant differences between programming languages. The discussion about the first programming language continues, and there will probably never be a consensus among academics. The computer science curriculum recommendations of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE) do not clearly define which programming language to adopt: it is the course directors and teachers who must make this choice, consciously and not merely following trends. This article presents a set of items that should be considered when choosing a programming language for the first programming unit in higher education computer science courses.


Introduction
The ability to program is the "visible" skill to be acquired in an introductory unit in computer science. Programming can be considered an art [1], a science [2], a discipline [3] or even the science of abstraction [4]. However, a programming language is no more than a means by which the programmer communicates instructions to the computer. Before designing a program, the student must understand the problem and know how to use the tools needed to solve it with the machine, such as methods for writing rigorous specifications and solutions that can be implemented on the computer. For this, the student will also have to learn one or more programming languages and paradigms in order to apply programming concepts and systematize the use of data structures and algorithms to solve different categories of problems [5].
Students often have the perception that the focus is on learning the syntax of the programming language, leading them to concentrate on implementation activities rather than on activities such as planning, design, or testing [6].
The art of programming involves four steps [7]: a) To Think: the conceptualization and analysis phase, in which the problem is divided into small, easily understood processes or tasks, forming a modular structure organized according to top-down logic [8]; b) To Solve: translate the top-down structure into an algorithm [1] that expresses the solution rules in pseudocode; c) To Define: characterize, using variables and data structures, the data model to be used in the algorithm; d) To Formalize: translate the algorithm into a programming language, then implement and execute it on the computer.
Then comes the most important phase, the true moment of truth: does the program run, is it error-free, and does it give the correct result? And how can you be sure that the result is "the" correct solution, or only "probably" the correct one?
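As a minimal illustration of the four steps and of the final check, consider a deliberately small problem chosen here only for illustration (it does not come from [7]): computing the average of a list of grades and deciding whether the student passes. The function names and the pass mark of 10 on a 0-20 scale are assumptions made for this sketch.

```python
# Think: split the problem top-down into small tasks.
#   1. compute the average of the grades
#   2. compare the average with the pass mark
#
# Solve: pseudocode for the algorithm.
#   total   <- sum of all grades
#   average <- total / number of grades
#   passed  <- average >= pass mark
#
# Define: the data model.
#   grades: list of numbers; PASS_MARK: constant (hypothetical value)

PASS_MARK = 10.0  # assumed pass mark on a 0-20 scale


def average(grades):
    """Return the arithmetic mean of a non-empty list of grades."""
    if not grades:
        raise ValueError("at least one grade is required")
    return sum(grades) / len(grades)


def has_passed(grades, pass_mark=PASS_MARK):
    """Return True when the average grade reaches the pass mark."""
    return average(grades) >= pass_mark


# Formalize: the functions above are the algorithm written in Python.
# Moment of truth: compare the results with values computed by hand.
if __name__ == "__main__":
    assert average([12, 8, 16]) == 12.0
    assert has_passed([12, 8, 16])
    assert not has_passed([5, 9, 10])
    print("all checks passed")
```

Checks of this kind only show that the program agrees with the cases that were verified by hand; they support "probably correct" rather than "the" correct solution.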
Table 1 shows how ten of the best-known programming languages write the famous "Hello, World!" program.
Table 1. "Hello, World!" in ten different programming languages.
Some say that programming is very difficult [9] [10] while for others it may be easy [11]. Success is achieved through a good deal of study, research, planning, persistence and preferably a passion for the activity.
This article is divided into five parts: this introduction; a second part on programming languages, their concept and characterization; a third part on the evolution of programming languages in undergraduate computer science studies; a fourth part on choosing the initial programming language; and a final part with conclusions and future work.

Programming languages: concept and characterization
A programming language is a system that allows interaction between humans and machines, being "understood" by both. It is a formal language that specifies a set of instructions and rules. Programming languages are the medium of expression in the art of computer programming. Programs must be written succinctly and clearly, because they are meant to be understood, modified, and maintained throughout their lifetime: a good programming language should help others to read programs and to understand how they work [12]. A program is a set of instructions that make up a solution after being coded in a programming language [13].
There are several reasons why thousands of high-level programming languages exist and new ones continue to emerge [14]:
-Evolution: The late 1960s and early 1970s saw a revolution in "structured programming," in which the GoTo-based flow control of languages such as FORTRAN, COBOL, and Basic gave way to while loops, case (switch) statements and similar constructs. In the late 1980s, Algol, Pascal and Ada began to give way to object-oriented languages such as Smalltalk, C++ and Eiffel. And so on.
-Special purposes: Some programming languages are designed for specific purposes. C is good for low-level systems programming. Prolog is good for reasoning about logical relationships between data. Each can be used successfully for a wide range of tasks, but the emphasis is clearly on the specialty.
-Personal preference: Different people like different things. Some people love C while others hate it, for example.
According to the Stack Overflow Annual Developer Survey [15], with over 90,000 responses from over 170 countries, the most widely used programming language in 2019 was JavaScript (Table 2). In September 2019, the TIOBE Programming Community index [16], an indicator of the popularity of programming languages, featured Java as the most popular (Table 3), followed by C and Python. The technology behind the most popular websites (Table 4), according to Wikipedia (Wikipedia, 2019), also varies in its use of back-end languages; however, JavaScript is almost always used on the front end.

Evolution of programming languages in undergraduate computer science studies
Computer science became a recognized academic field in October 1962 with the creation of the first computer science department, at Purdue University [18]. The first curriculum studies appeared in March 1968, when the Association for Computing Machinery (ACM) published an innovative and necessary document, Curriculum 68: Recommendations for academic programs in computer science [19], with early indications of curriculum models for programs in computer science and computer engineering. Prerequisites, descriptions, detailed outlines, and annotated bibliographies were included for each of the courses. As the initial unit, it presented B1. Introduction to Computing (2-2-3), in which an algorithmic language was proposed, recommending that only one language be used, or two "in order to demonstrate the wide diversity of the computer languages available"; "Because of its elegance and novelty, SNOBOL can be used quite effectively for this purpose."
With the emergence of many new courses and departments, the ACM published a new report, Curriculum'78: recommendations for the undergraduate program in computer science [20], updating Curriculum 68. It introduced the designation CS1: Computer Programming I (2-2-3): "The emphasis of the course is on the techniques of algorithm development and programming with style. Neither esoteric features of a programming language nor other aspects of computers should be allowed to interfere with that purpose." Despite the importance of Curriculum'78, there was much discussion, particularly regarding the CS1 and CS2 sequence.
In 1984 a new report was published, "Recommended curriculum for CS1, 1984" [21], "to detail a first computer science course that emphasizes programming methodology and problem solving." This report refers to Pascal, PL/I and Ada: "These features are important for many reasons. For example, a student cannot reasonably practice procedural and data abstraction without using a programming language that supports a wide variety of structured control features and data structures". It states that "Although FORTRAN and BASIC are widely used, we do not regard either of these languages as suitable for CS1" and that ALGOL "does satisfy the requirements but is omitted from our list of recommended languages simply because it is no longer widely used or supported."
In 1991, the IEEE (Institute of Electrical and Electronics Engineers) and the ACM jointly produced a new document [22]. This document broke with some of the concepts of previous documents, presenting a set of individual knowledge units, each corresponding to a topic that should be addressed at some point in the undergraduate program. In this way, institutions have considerable flexibility in setting up course structures that meet their particular needs.
In 2001 a new document was published [23]. This document questioned the programming-first approach of previous documents, since early programming approaches may lead students to believe that writing a program is the only viable way to solve problems using a computer, and since focusing only on programming reinforces the common misperception that "computer science" equals programming. It stated: "In fact, the problems of the programming-first approach can be exacerbated in the objects-first model because many of the languages used for object-oriented programming in industry - particularly C++, but to a certain extent Java as well - are significantly more complex than classical languages. Unless instructors take special care to introduce the material in a way that limits this complexity, such details can easily overwhelm introductory students".
In 2008 a new report was presented: "Computer Science Curriculum 2008: An Interim Revision of CS 2001" [24]. It makes minor revisions to the 2001 document and places strong emphasis on security. Curriculum'2008 reinforces the idea that "Computer science professionals frequently use different programming languages for different purposes and must be able to learn new languages over their careers as the field evolves. As a result, students must recognize the benefits of learning and applying new programming languages. It is also important for students to recognize that the choice of programming paradigm can significantly influence the way one thinks about problems and expresses solutions of these problems. To this end, we believe that all students must learn to program in more than one paradigm".
When referring to languages and paradigms, the "Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science" [25] says that the choice of programming language seems to depend on the chosen paradigm and that "There does, however, appear to be a growing trend toward "safer" or more managed languages (for example, moving from C to Java) as well as the use of more dynamic languages, such as Python or JavaScript." "Visual programming languages, such as Alice and Scratch, have also become popular choices to provide a "syntax-light" introduction to programming; these are often (although not exclusively) used with non-majors or at the start of an introductory course". And "some introductory course sequences choose to provide a presentation of alternative programming paradigms, such as scripting vs. procedural programming or functional vs. object-oriented programming, to give students a greater appreciation of the diverse perspectives in programming, to avoid language-feature fixation, and to disabuse them of the notion that there is a single "correct" or "best" programming language". It is clear that the curriculum recommendations do not indicate which programming language to adopt. However, they consistently say that the language should be as simple as possible in usability and syntax, for better learning. Language choice has always been a matter of concern to educators [26] [27] [28] [29] [30].
FORTRAN was selected as the high-level language for the first introductory courses, especially those linked to engineering departments. The less widely used COBOL was adopted by departments that were more closely linked to information systems [31]. At that time there was no talk of methodology: everything was just programming.
The emergence of BASIC in 1964 [32] led some departments to use this language for introductory students. In 1972 almost all computer science degree programs used ALGOL, FORTRAN or LISP, while most data processing programs used COBOL. In Britain, BASIC was also important. In the late 1960s, some departments tried various languages such as PL/I [33].
With Dijkstra's manifesto [34], structured programming began to be discussed [35] [36]. With the emergence of the Pascal language [37], the choice seems to have become almost consensual [31]: it was a language written almost expressly for the purpose of learning to program, it had a very friendly development environment [38], and it benefited from the proliferation of personal computers and the availability of Pascal compilers [39].
Pascal's decline began in the late 1980s and early 1990s, with the rise of object-oriented programming, but also because Pascal makes code reuse difficult and because it is not a "real world" language [39]. McCauley and Manaris [40] report that, as a first language, Pascal was used by 36% and C++ by 32% in 1995-1996, but 22% intended to switch to C++, C, Ada or Java. There are several studies that present the evolution of the languages adopted in initial programming curricular units [41] [42] and even lists of programming languages taught in various courses [43].
In Portugal [44], in the 2016-2017 school year, the most common first-year programming language sequence in the 46 courses analyzed was C (48%), followed by Java (22%), C and Haskell (9%), C and Java (4%), and Scheme and Java (4%). There were also residual sequences: Excel and C, Python, Python, HTML and Java, Python and Java, Scheme and C++, and XML and Java. Regarding the ten Portuguese first cycle (or integrated master's degree) courses in Computer Engineering considered most significant [45], the most common sequences were Java alone or Python and C (30% each), C (20%), and Python and Java or Haskell and C (10% each).
According to the document "An Analysis of Introductory Programming Courses at UK Universities" [46]:
-73.8% use only one programming language; 21% reported using two.
-The most widely used language is Java (46%), followed by the "C family" (C, C++ and C#) (23.6%) and Python (13.2%). JavaScript and Haskell are much less widely adopted.
-The reason given by 82.7% of those who use Java was that it is object oriented, while 72.7% of those who use Python refer to its pedagogical benefits.
According to the document "Introductory Programming Courses in Australasia in 2016" [46], referring to the universities of Australia and New Zealand:
-48 courses studied: 15 used Java, 15 Python, 8 C, 5 C#, 2 Visual Basic and 2 Processing. The remaining ten use another programming language.
"What language? -The choice of an introductory programming language" [47], a study of 496 four-year courses in the United States, reports that Java is used by 41.94%, Python by 26.45%, C++ by 19.35%, C by 4.52%, C# by 0.65% and another language by 7.10%. The reasons given for the choice were: programming language features (26.19%), ease of learning (18.81%), job opportunities for students (14.76%), popularity in academia (13.10%), institutional tradition (8.57%), choice of the advisory board (5.95%), and availability of teachers or scheduling restrictions (5%).
A 2016 study [48] analysed 218 colleges and 143 universities in 35 European countries, indicating that the most commonly used programming language was C (30.6%), followed by C++ (21.9%) and Java (20.7%). A document [10] covering 152 CS1 units from a number of different countries concludes that Java is by far the most common CS1 language, used in 74 (49%) of the 152 programs. The second most frequent is Python, with 36 (24%). C++ comes next with 30 (20%), followed by C with 8 (5%), the most obvious change being the rise of Python, which "probably occurred at the expense of Java and C++".
Today, with few exceptions, academia follows the "real world", and the "C family" (C, C++, C#), Python, Java, and JavaScript are undoubtedly the programming languages adopted in introductory programming units.

Choosing the Initial Programming Language
In 2004, Eric Roberts [49] commented that the languages, paradigms, and tools used to teach computer science had become increasingly complex, adding pressure to cover more material in an already overcrowded area. The problem of complexity is exacerbated by the fact that languages and tools change rapidly, leading to profound instability in the way computer science is taught. Roberts saw Java as the way forward: "we must take responsibility for breaking this cycle of rapid obsolescence by developing a stable and effective collection of Java-based learning applications that meet the needs of the science education community".
Dijkstra [50] wrote about the importance of the chosen programming language: "the tools we are trying to use and the language or notation we are using to express or record our thoughts, are the major factors determining what we can think or express at all! The analysis of the influence that programming languages have on the thinking habits of its users, and the recognition that, by now, brainpower is by far our scarcest resource, they together give us a new collection of yardsticks for comparing the relative merits of various programming languages." When selecting the first programming language for introductory programming courses, it is important to consider whether it is suitable for teaching and learning. Over time, various pseudo-code languages have been created in search of the perfect teaching language, but no definitive solution has been found [51].
The document "Introductory Programming Subject in European Higher Education" [48] discusses the need to teach introductory programming using educational programming languages. However, such languages have been discontinued in the past, Pascal being the most visible example.
The programming language chosen for introductory programming courses often seems to be a matter of religion or football allegiance. In the reflection piece "The Programming Language Wars" [52] it is even said that "Programming language wars are a major social problem causing serious problems in our discipline" leading to "massively duplicating efforts" and "reinventing the wheel constantly." Choosing the best programming language is often an emotional issue, leading to major debates [53], but for Guerreiro [54] "It is up to us to have an open, exploratory attitude and at the same time not dogmatically accept what those who make the most noise say. In fact, I think we should even pass this on to students too, to help them develop their critical thinking, and to be able, sooner or later, to choose the languages and tools that can best respond to their needs".
In fact, two of the most important points are pedagogical issues and the preparation of students for the world of work. Parker and Devey [33] classify the criteria as pragmatic and pedagogical, the pragmatic ones including industry acceptance, market penetration and the employability of graduates.
Keep in mind that "small programming" needs to be mastered before "large programming" [55], since traditionally students only "in the third or fourth year are faced with the problems that arise in the design of large programs." Collberg [55] said that choosing the initial language is not an easy task. The choice must take into account factors such as simplicity, expressiveness, suitability for tasks, availability of accessible resources, and reliable compilers.
Programming languages are the fundamental basis of programming, but trends change dramatically over time. Professionals will not use the same programming language, or even the same programming model, for their entire professional career. In addition, well-informed language choices can make a huge difference in programmer productivity and program quality. Therefore, it is crucial that students master the essential concepts of programming languages, so that they can choose and use languages based on a deep understanding of the abstractions they express and their ability to solve programming problems [56].
Choosing the initial programming language should take into account several points: course objectives, teacher preferences, available implementations, and relationships with other course units, as well as the "real world": students are often more motivated to study a familiar language that is known to be in demand among employers [57].
Howatt [58] proposes an evaluation method for programming languages based on several items: language design and implementation (accuracy and speed), human factors (usability and ease), software engineering (portability, reliability and reuse) and application domain (specific applications).
The paradigm chosen can be very important [59], unless one adopts the approach of "exposing students to all major paradigms through the use of a multiparadigmatic language, and does not attempt to identify" the "correct paradigm" [60].
The document "A Formal Language Selection Process" [61] proposes a selection procedure based on a weighted multicriteria method, identifying evaluation criteria such as Reasonable Financial Cost, Academic / Student Version Availability, Academic Acceptance, Textbook Availability, Lifecycle Stage, Industry Acceptance, Marketability (regional and national), Student / Academic / Full System Requirements, Operating System Dependency, Proprietary / Open Source, Development Environment, Debugging Facilities, Ease of Learning Fundamentals, Secure Code Support, Advanced Features for Subsequent Courses, More or Less Complicated Programming, Web Development Support, Teaching Support, Object Oriented Support, Support Availability, Instructor and Staff Training, and Expected Level of New Students.
Mannilla and de Raadt [30] compare multiple languages using various inclusion / exclusion criteria, such as: being suitable for teaching, being interactive and fast, promoting the writing of correct programs, allowing programming "in the small", providing a seamless development environment, having a good user community, being open source, being well supported, being free, having good teaching material, not being used only for educational purposes, and being reliable and efficient.
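The general idea of a weighted multicriteria choice can be sketched in a few lines of Python. This is only an illustration of the technique, not the procedure of [61]: the three criteria shown are a small subset, and the weights, ratings and the candidate names "Language A" and "Language B" are hypothetical values invented for the example.

```python
# Hypothetical weights (relative importance of each criterion); not taken from [61].
WEIGHTS = {
    "ease_of_learning": 5,
    "industry_acceptance": 3,
    "textbook_availability": 2,
}

# Hypothetical ratings from 1 (poor) to 5 (excellent) for two unnamed candidates.
RATINGS = {
    "Language A": {
        "ease_of_learning": 5,
        "industry_acceptance": 3,
        "textbook_availability": 4,
    },
    "Language B": {
        "ease_of_learning": 2,
        "industry_acceptance": 5,
        "textbook_availability": 5,
    },
}


def weighted_score(ratings, weights):
    """Return the weighted average of the ratings for one candidate."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def rank(candidates, weights):
    """Return (name, score) pairs sorted from the best score to the worst."""
    scores = {name: weighted_score(r, weights) for name, r in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank(RATINGS, WEIGHTS):
        print(f"{name}: {score:.2f}")
```

With these invented numbers the candidate that is easier to learn ranks first; changing the weights to favour industry acceptance would reverse the order, which is why the weights need to be agreed before the ratings are collected.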
Several attempts have been made in the past to sort programming languages [41].
There are numerous comparisons between the most commonly used languages, such as Python vs C++ [62], Python vs C [63], Java vs Python [64], and C++ vs Java [65]. Each of the three or four most commonly used programming languages is free, well supported, reliable and efficient, and has a large user community.
Ease of learning can be discussed: C has a more complicated syntax than Python. The major differences are the use of explicit pointers (C only), the choice between passing parameters by value or, through pointers, by reference (C only), the programming paradigm (procedural in C, object oriented in the others), and being compiled to native code (C) or executed by an interpreter or virtual machine (Python and Java respectively).
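The parameter-passing difference can be made concrete with a short sketch. Python is used here: it has no explicit pointers, and arguments are passed as references to objects, so rebinding a parameter inside a function does not affect the caller, while changing a mutable object in place does. The function names are illustrative only.

```python
def rebind(values):
    """Rebind the local name to a new list; the caller's list is not affected."""
    values = [0, 0, 0]
    return values


def mutate(values):
    """Change the list in place; the caller sees the change."""
    values.append(99)


if __name__ == "__main__":
    data = [1, 2, 3]
    rebind(data)
    print(data)  # the caller's list is unchanged
    mutate(data)
    print(data)  # the caller's list now ends with 99
```

In C, a function that needs to change a caller's variable must receive a pointer to it explicitly, which is one of the concepts a beginner has to absorb early when C is the first language.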

Conclusions
A programming language is used to materialize the solution of a problem. A program should only be written after finding the best solution.
There are numerous programming languages, adopted for reasons of evolution, purpose of use or even personal taste.
The choice of programming language for introductory teaching must keep pace with this evolution but, because the unit is propaedeutic in character, it must also meet several requirements, namely pedagogical soundness and acceptance by the outside world.
As future work we will compare the three programming languages most used today in CS1 curricular units, examining the simplicity, the IDE, the debugger and other features identified in this article.
There is not, and probably never will be, consensus as to which language should be chosen to introduce the student to the world of computer science.
The first programming language of a future computer science professional is just the beginning of a long walk.