Adding Apache Superset to openctest framework #42

4 changes: 3 additions & 1 deletion core/README.md
@@ -14,13 +14,15 @@ A prototype for generating and running ctests. Below are the projects we current
- Hbase 2.2.2: `hbase-server`.
- ZooKeeper 3.5.6: `zookeeper-server`.
- Alluxio 2.1.0: `core`.
- Superset 3.0.2: `superset-websocket`.

We also provided our instrumented versions of the above projects:

- Hadoop 2.8.5: https://github.com/xlab-uiuc/hadoop
- Hbase 2.2.2: https://github.com/xlab-uiuc/hbase
- ZooKeeper 3.5.6: https://github.com/xlab-uiuc/zookeeper
- Alluxio 2.1.0: https://github.com/xlab-uiuc/alluxio
- Superset 3.0.2: https://github.com/ishitakarna/superset

Our instrumented version projects have two branches:
- `ctest-injection`: branch with "Intercept Configuration API" instrumentation (See `ADDING_NEW_PROJECT.md`). This branch is used by `generate_ctest` and `run_ctest`.
@@ -59,7 +61,7 @@ To generate ctests or run ctest, you need to first clone the target project.
1. In `openctest/core`, run `./add_project.sh <main project>` to clone the project, switch to and build the branch `ctest-injection`. This branch will be later used by `generate_ctest` and `run_ctest`.
2. In `openctest/core/identify_param`, run `./add_project.sh <main project>` to clone the project, switch to and build the branch `ctest-logging`. This branch will be later used by `identify_param`.

`<main project>` can be `hadoop`, `hbase`, `zookeeper` or `alluxio`.
`<main project>` can be `hadoop`, `hbase`, `zookeeper`, `alluxio`, or `superset`.

## Usage

8 changes: 8 additions & 0 deletions core/add_project.sh
@@ -48,6 +48,13 @@ function setup_alluxio() {
    mvn clean install -DskipTests -Dcheckstyle.skip -Dlicense.skip -Dfindbugs.skip -Dmaven.javadoc.skip=true
}

function setup_superset() {
    [ ! -d "app/superset" ] && git clone https://github.com/ishitakarna/superset app/superset
    cd app/superset
    git fetch && git checkout ctest
    npm install
}

function usage() {
    echo "Usage: add_project.sh <main project>"
    exit 1
@@ -64,6 +71,7 @@ function main() {
            hbase) setup_hbase ;;
            zookeeper) setup_zookeeper ;;
            alluxio) setup_alluxio ;;
            superset) setup_superset ;;
            *) echo "Unexpected project: $project - only support hadoop, hbase, zookeeper, alluxio and superset." ;;
        esac
    fi
23 changes: 23 additions & 0 deletions core/default_configs/superset-default.tsv
@@ -0,0 +1,23 @@
gcChannelsIntervalMs 120000 Time interval for garbage collecting inactive channels
jwtAlgorithms.0 HS256 First algorithm used for JSON Web Token (JWT) encoding/decoding
jwtChannelIdKey channel Key name in JWT for identifying the channel
jwtCookieName test-async-token Name of the cookie used for storing the JWT
jwtSecret test123-test123-test123-test123-test123-test123-test123 This is the secret key used for signing JSON Web Tokens (JWTs)
logFilename app.log Name of the file where logs are written
logLevel info Severity level of the logs being recorded (e.g., info, debug)
logToFile FALSE Boolean indicating whether to log to a file (true) or not (false)
pingSocketsIntervalMs 20000 Interval for pinging sockets to check their connectivity
port 8125 Network port used for connecting
redis.db 10 This parameter specifies the database number to be used in the Redis data store
redis.host 127.0.0.1 This parameter specifies the hostname or IP address of the Redis server
redis.password some pwd This is the password used for authentication to the Redis server
redis.port 6379 This defines the port number on which the Redis server is running and accepting connections
redis.ssl FALSE A boolean parameter indicating whether SSL encryption should be used for the connection to the Redis server
redis.username default Username for authenticating with the Redis server
redis.validateHostname TRUE Boolean indicating whether to validate the Redis server's hostname
redisStreamPrefix test-async-events- Prefix for Redis stream keys
redisStreamReadBlockMs 5000 Time in milliseconds to block when reading from a Redis stream
redisStreamReadCount 100 Number of messages to read from a Redis stream in one go
socketResponseTimeoutMs 60000 Timeout in milliseconds for socket responses
statsd.host 127.0.0.1 This is the hostname or IP address of the StatsD server
statsd.port 8125 This parameter specifies the port number on which the StatsD server is listening
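
Review note: each row above holds a parameter name, its default value, and a description. A minimal sketch of how such a file could be loaded, assuming the three columns are separated by tabs (as the `.tsv` extension suggests; the tabs are not visible in this rendering). `load_defaults` and `SAMPLE` are hypothetical names used only for illustration and are not part of this PR.

```python
import csv
import io

# Three sample rows copied from the file above, with the tabs written explicitly.
SAMPLE = (
    "port\t8125\tNetwork port used for connecting\n"
    "redis.host\t127.0.0.1\tThis parameter specifies the hostname or IP address of the Redis server\n"
    "redis.ssl\tFALSE\tA boolean parameter indicating whether SSL encryption should be used\n"
)


def load_defaults(stream):
    """Return {parameter: (default, description)} from a tab-separated stream."""
    defaults = {}
    for row in csv.reader(stream, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, default = row[0], row[1]
        description = row[2] if len(row) > 2 else ""
        defaults[name] = (default, description)
    return defaults


if __name__ == "__main__":
    for name, (default, _) in load_defaults(io.StringIO(SAMPLE)).items():
        print(f"{name} = {default}")
```

Every default is read as a string; any conversion to numbers or booleans would have to happen downstream.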
91 changes: 91 additions & 0 deletions core/generate_ctest/gen_ctest.py
@@ -0,0 +1,91 @@
import subprocess
import csv
import json
import os


def read_tsv(file_path):
    """Map each parameter name (first column) to its list of candidate values."""
    params = {}
    with open(file_path) as file:
        reader = csv.reader(file, delimiter='\t')
        for row in reader:
            if not row:  # skip blank lines, which would break the unpacking below
                continue
            param, *values = row
            params[param] = values
    return params


def read_json(file_path):
    with open(file_path) as file:
        return json.load(file)


def run_test_case(param, value, test_cases, report):
    """Run every test mapped to `param` with `value` injected and record the outcome."""
    for test_case in test_cases:
        # Each entry is split on ": " into a test file and a test name
        test_file, test_name = test_case.split(": ")
        test_name = test_name.replace('_', ' ')
        test_file_path = f"/home/ikarna2/project/fork/superset/superset-websocket/spec{test_file}"

        # Generate the JSON configuration command
        gen_json_cmd = ["python3", "gen_json.py", param, value, "config.test.override.json"]
        print("Executing command:", ' '.join(gen_json_cmd))
        subprocess.run(gen_json_cmd)

        # Run the test case command
        test_cmd = ["python3", "ctest_runner_json_override.py", "config.test.override.json",
                    "/home/ikarna2/project/fork/superset/superset-websocket", test_file_path, f'"{test_name}"']
        print("Executing command:", ' '.join(test_cmd))
        result = subprocess.run(test_cmd, capture_output=True, text=True)

        # Process the result: 'p' for pass (exit code 0), 'f' for fail
        if result.returncode == 0:
            test_result = 'p'
        else:
            test_result = 'f'

        # Assuming the execution time is part of the result output
        execution_time = extract_execution_time(result.stdout)

        report.append([param, test_case, value, test_result, execution_time])


def extract_execution_time(output):
    # Extract execution time from the output
    # Placeholder for actual implementation
    return "time_placeholder"


def write_report(report, file_path):
    with open(file_path, 'w', newline='') as file:
        writer = csv.writer(file)
        for row in report:
            writer.writerow(row)


def main():
    tsv_file = 'default_params.tsv'  # Placeholder for TSV file path
    json_file = 'flattened_params_to_tests.json'  # Placeholder for JSON file path
    report_file = 'test_report.csv'

    params = read_tsv(tsv_file)
    test_mapping = read_json(json_file)
    report = []

    for param, values in params.items():
        test_cases = test_mapping.get(param, [])
        for value in values:
            if value != 'SKIP':
                run_test_case(param, value, test_cases, report)

    write_report(report, report_file)


if __name__ == "__main__":
    main()
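
Review note: `extract_execution_time` above is still a placeholder, so every report row carries the literal string `time_placeholder`. One way to fill that column without depending on the test runner's output format is to time the subprocess call itself. A minimal sketch, assuming wall-clock time (including process start-up) is an acceptable measure; `timed_run` is a hypothetical helper and not part of this PR.

```python
import subprocess
import sys
import time


def timed_run(cmd):
    """Run cmd and return (CompletedProcess, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Demo with a trivial child process; a real call would pass test_cmd instead.
    result, elapsed = timed_run([sys.executable, "-c", "print('ok')"])
    print(result.returncode, result.stdout.strip(), f"{elapsed:.3f}s")
```

In `run_test_case`, the returned `elapsed` value could then be written to the report in place of the placeholder string.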
61 changes: 61 additions & 0 deletions core/generate_ctest/gen_json.py
@@ -0,0 +1,61 @@
import json
import sys


def create_json_from_flattened_key(key, value):
    """Expand a dotted key such as "redis.host" into a nested dict holding value.

    Numeric parts (e.g. the "0" in "jwtAlgorithms.0") are treated as list indices.
    """
    parts = key.split('.')
    json_obj = {}
    current_level = json_obj

    for i, part in enumerate(parts):
        is_last_part = i == len(parts) - 1

        if part.isdigit():  # If part is a number, handle as a list
            part = int(part)
            # Ensure current level is a list
            # NOTE: rebinding here detaches the new list from json_obj, so a key
            # that starts with an index, or places one index directly under
            # another, loses its value in the returned object.
            if not isinstance(current_level, list):
                current_level = []
            # Expand the list if necessary
            while len(current_level) <= part:
                current_level.append({} if not is_last_part else None)
            if is_last_part:
                current_level[part] = value
            else:
                current_level = current_level[part]
        else:  # Handle as a dict
            if part not in current_level:
                current_level[part] = [] if (i+1 < len(parts) and parts[i+1].isdigit()) else {}
            if is_last_part:
                current_level[part] = value
            else:
                current_level = current_level[part]

    return json_obj


def write_json_to_file(json_data, filename):
    with open(filename, 'w') as file:
        json.dump(json_data, file, indent=4)


def main():
    if len(sys.argv) != 4:
        print("Usage: gen_json.py <flattened_key> <value> <output_file>")
        sys.exit(1)

    flattened_key = sys.argv[1]
    value = sys.argv[2]  # always a string, so it is written as a JSON string
    output_file = sys.argv[3]

    json_data = create_json_from_flattened_key(flattened_key, value)
    write_json_to_file(json_data, output_file)
    print(f"JSON data written to {output_file}")


if __name__ == "__main__":
    main()
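
Review note: because `value` comes straight from `sys.argv`, a call such as `gen_json.py port 8125 out.json` writes `"8125"` as a JSON string rather than a number. Whether the `superset-websocket` config loader needs typed values is not established in this PR, so the following is only a sketch under that assumption. `coerce_value` is a hypothetical helper that converts the command-line string before it is stored.

```python
import json


def coerce_value(raw):
    """Convert a command-line string into a bool, int, or float; else keep the str."""
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


if __name__ == "__main__":
    # Values taken from superset-default.tsv in this PR.
    override = {
        "port": coerce_value("8125"),
        "redis": {"ssl": coerce_value("FALSE"), "host": coerce_value("127.0.0.1")},
    }
    print(json.dumps(override))
    # prints {"port": 8125, "redis": {"ssl": false, "host": "127.0.0.1"}}
```

Strings that merely look numeric, such as a password made of digits, would also be converted, so a per-parameter type table may be the safer design if typed values turn out to matter.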