SAS Enterprise Guide 1: Querying and Reporting

Lesson 1

SAS® Enterprise Guide® 1: Querying and Reporting 1
Copyright © 2011 SAS Institute Inc., Cary, NC, USA. All rights reserved.
 
Introduction to SAS Enterprise Guide

SAS Enterprise Guide is a Microsoft Windows client application that includes an intuitive visual interface, transparent data access, interactive tasks and queries for data analysis, export capabilities, a programming interface, and a complete help system and tutorial.

The work you do in SAS Enterprise Guide is saved in a project file with the file extension .EGP. The objects saved in a project are pointers to data sources, SAS programs and logs, tasks and queries, results, and any informational notes that you add.

Behind the scenes, SAS Enterprise Guide communicates with SAS to manipulate data and generate reports. SAS could be running on a Windows, Unix, or Mainframe server. It could also be running on the same PC where SAS Enterprise Guide is running. SAS Enterprise Guide must be configured to communicate with SAS either on the server or on the PC.

As you point and click in SAS Enterprise Guide, SAS code is generated. When you run a task, the code is sent to SAS for execution. SAS does all the processing and then sends the results back to SAS Enterprise Guide to be displayed on the client machine.

A typical workflow in SAS Enterprise Guide includes the following steps:

  • create a new project
  • add data to the project
  • run tasks or queries on the data
  • view the results

Lesson 2

Working with SAS Data Sets
The terms table, column, and row are commonly used across different types of databases. In SAS, a table is usually called a data set. You may also hear the term variable for a column, and observation for a row.
The descriptor portion of a SAS data set contains the data set attributes, such as the name, storage location, modified date, column attributes, and the number of rows and columns. You can view the properties of a data set by opening the Properties window from the data pop-up menu.
All columns in a SAS data set must have three attributes: a name, a type, and a length. In addition, they can have a format, informat, and label.
  • According to standard SAS naming rules, column names can be 1 to 32 characters long. The name must start with a letter or underscore, and the remaining characters can be any combination of letters, underscores, or numbers.
  • A column’s type is either character or numeric. Character variables can store any values, such as letters, numbers, special characters, and blanks. Numeric variables can store only numeric values, which can include the digits 0 through 9, a minus sign (-), a single decimal point (.), and E for scientific notation. Date and time variables are numeric variables with a format applied. A SAS date value is the number of days since 01JAN1960. A SAS time value is the number of seconds since midnight.
  • A column's length indicates the number of bytes used to store it. Character values are stored with a length of 1 to 32,767 bytes. One byte equals one character. All numeric variables have a default length of 8 bytes. Numeric values, no matter how many digits they contain, are stored as floating-point numbers in 8 bytes of storage. This provides space for 16 or 17 significant digits.
  • A format can be applied to data values to control how they are displayed in reports and in the data grid. An informat is an instruction that SAS Enterprise Guide uses to read data values into a column.
  • Labels for columns can be up to 256 characters in length and can include spaces and special symbols.

Importing Structured Data in Other Formats

You can use the Import Data wizard to import structured data, such as Microsoft Excel files and text files to create a SAS data set that you can use in tasks and queries. The options you specify in the wizard vary by the type of file you are importing.

SAS data sets that you create in SAS Enterprise Guide are stored in libraries. A SAS library is a collection of SAS files that is recognized as one unit by SAS. Each SAS library has a unique name that tells SAS where the data is stored and how to access it. Work is a temporary library that is automatically created by SAS. When you exit SAS Enterprise Guide, the files in the Work library are deleted. SAS libraries are generally defined by your SAS administrator.

Importing Text Files

You can use the Import Data wizard to create a SAS data set from a text or raw data file. The text file must either have data arranged in fixed positions or have data separated by delimiters.

Accessing Remote Data

The most efficient way to access remote data is through one or more SAS libraries. Typically SAS libraries are set up by a SAS administrator after the appropriate software is installed and configured on a server. There is no limit to the number of libraries that can be defined.

If the appropriate software is installed on the server, a SAS library can point to a collection of data tables in a DBMS.

You can use the Assign Project Library wizard to define temporary libraries to SAS or DBMS data sources to which you have read permission.

You can also view and access remote files and data sources in folders and libraries by using the Open Data window or by using the Server List.

Lesson 3

Introduction to Tasks and Wizards
The real power of SAS Enterprise Guide is in tasks: specific types of analyses, reports, or data manipulations that you can run on data in your project. The output of a task is generally a report, a data set, or both. Each task has its own window with options that relate to that task. As you point and click, the task generates SAS code and then submits it to SAS.
 
Some tasks also have wizard versions. Wizards do not have all the functionality of a task but they are generally easier to use because they guide you step-by-step to generate results. 
 
Before you open a task or wizard, you generally start by selecting the data that you want to use. To select data, you can either open the data in the data grid, or you can simply highlight the data in the Project Tree or Process Flow. This makes the data the active data source. 
Although each task has a separate window, most task windows have a similar format. On the left side of the task window is a selection pane with categories of options that are available for the particular task. When you click the category name in the selection pane, the right side of the window displays the specific options that you can set. The Data, Titles, and Properties categories are common to most tasks. 
 
On the task Data page, you can do the following:
  • change the input data source and create a filter
  • assign variables to task roles
  • modify the properties of columns in the task
On the Titles page, you can specify custom titles or footnotes for the report. 
 
On the Properties page, you can do the following:
  • specify a label for the task
  • limit the number of rows used for processing
  • specify different result formats for the task.
After you run a task, the Modify Task button reopens the task or wizard and enables you to make changes to the options that you specified when you last ran the task. If you make changes, you must run the task again to refresh the report. The Refresh button enables you to rerun the task without changing any options.