Pull grant and grantee data from HSES on a schedule #248

Merged
merged 31 commits into from
Jan 13, 2021
Changes from 28 commits
95b625a
add cf login to db instructions
SarahJaine Dec 23, 2020
85bc5c4
add url
SarahJaine Dec 23, 2020
42e4036
Pull grant and grantee data from HSES
kryswisnaskas Jan 6, 2021
39771a5
Remove commented out code
kryswisnaskas Jan 6, 2021
04e8027
Test periodic runs
kryswisnaskas Jan 6, 2021
18df146
Merge branch 'main' of https://github.com/adhocteam/Head-Start-TTADP …
kryswisnaskas Jan 6, 2021
002397a
Refactor tests
kryswisnaskas Jan 7, 2021
4f9476a
Add HSES data environment variables
kryswisnaskas Jan 8, 2021
ad4206c
Adjust back the cron schedule
kryswisnaskas Jan 8, 2021
958800b
Temporary change to allow testing of cron schedule
kryswisnaskas Jan 8, 2021
1f7bcaa
Merge branch 'main' into sj-add-db-instructions
jasalisbury Jan 8, 2021
e94a400
Merge pull request #101 from adhocteam/sj-add-db-instructions
jasalisbury Jan 8, 2021
5b107ad
Merge branch 'main' into kw-pull-hs-data
kryswisnaskas Jan 8, 2021
4aaa919
Obfuscate PII
kryswisnaskas Jan 8, 2021
34379fb
Merge branch 'kw-pull-hs-data' of https://github.com/adhocteam/Head-S…
kryswisnaskas Jan 8, 2021
6a0744b
Merge pull request #109 from adhocteam/kw-pull-hs-data
kryswisnaskas Jan 8, 2021
00417ac
Refactor test, auto increment; adjust importing of goals
kryswisnaskas Jan 12, 2021
73f7b85
Merge branch 'main' into kw-pull-hs-data
kryswisnaskas Jan 12, 2021
c5ea237
Add seeding to the backend test
kryswisnaskas Jan 12, 2021
e3b24b1
Merge branch 'kw-pull-hs-data' of https://github.com/adhocteam/Head-S…
kryswisnaskas Jan 12, 2021
6e60b21
Fix backend test ci
kryswisnaskas Jan 12, 2021
e6672d9
Update README and .env.example
Jan 12, 2021
366c25e
Remove commented out code
kryswisnaskas Jan 12, 2021
bb16a59
Remove empty migration
kryswisnaskas Jan 12, 2021
1fedfbe
Added exit on error when unable to find a grant in the db
kryswisnaskas Jan 12, 2021
fb6d9c0
Merge pull request #110 from adhocteam/kw-pull-hs-data
kryswisnaskas Jan 12, 2021
aa86d8e
Merge branch 'main' into update-readme
dcmcand Jan 12, 2021
ae1dc3c
Merge pull request #111 from adhocteam/update-readme
dcmcand Jan 12, 2021
33998aa
Remove commented out line; correct variable names
kryswisnaskas Jan 13, 2021
76d0ec7
Merge branch 'main' into kw-pull-hs-data
kryswisnaskas Jan 13, 2021
62feb90
Merge pull request #112 from adhocteam/kw-pull-hs-data
kryswisnaskas Jan 13, 2021
31 changes: 29 additions & 2 deletions .circleci/config.yml
@@ -70,6 +70,15 @@ commands:
description: "Name of CircleCI project environment variable that
holds the New Relic License key, a required application variable"
type: env_var_name
hses_data_file_url:
description: "Url to download HSES grants and grantee data from"
type: env_var_name
hses_data_username:
description: "Username used to access the HSES grants and grantee data"
type: env_var_name
hses_data_password:
description: "Password used to access the HSES grants and grantee data"
type: env_var_name
steps:
- run:
name: Login with service account
@@ -86,7 +95,10 @@ commands:
--var AUTH_CLIENT_ID=${<< parameters.auth_client_id >>} \
--var AUTH_CLIENT_SECRET=${<< parameters.auth_client_secret >>} \
--var NEW_RELIC_LICENSE_KEY=${<< parameters.new_relic_license >>} \
--var SESSION_SECRET=${<< parameters.session_secret >>}
--var SESSION_SECRET=${<< parameters.session_secret >>} \
--var HSES_DATA_FILE_URL=${<< parameters.hses_data_file_url >>} \
--var HSES_DATA_USERNAME=${<< parameters.hses_data_username >>} \
--var HSES_DATA_PASSWORD=${<< parameters.hses_data_password >>}
parameters:
cg_org:
description: "Cloud Foundry cloud.gov organization name"
@@ -121,7 +133,7 @@ parameters:
default: "main"
type: string
sandbox_git_branch: # change to feature branch to test deployment
default: "js-fix-user-autoinc"
default: "kw-pull-hs-data"
type: string
jobs:
build_and_lint:
@@ -168,6 +180,9 @@ jobs:
- run:
name: Run migrations ci
command: yarn db:migrate:ci
- run:
name: Run seeders
command: yarn db:seed:ci
- run:
name: Test backend
command: yarn test:ci
@@ -302,6 +317,9 @@ jobs:
deploy_config_file: deployment_config/sandbox_vars.yml
new_relic_license: NEW_RELIC_LICENSE_KEY
session_secret: SANDBOX_SESSION_SECRET
hses_data_file_url: HSES_DATA_FILE_URL
hses_data_username: HSES_DATA_USERNAME
hses_data_password: HSES_DATA_PASSWORD
- run:
name: Return database to neutral, then migrate and seed
command: |
@@ -323,6 +341,9 @@ jobs:
deploy_config_file: deployment_config/dev_vars.yml
new_relic_license: NEW_RELIC_LICENSE_KEY
session_secret: DEV_SESSION_SECRET
hses_data_file_url: HSES_DATA_FILE_URL
hses_data_username: HSES_DATA_USERNAME
hses_data_password: HSES_DATA_PASSWORD
- run:
name: Undo database seeding, then migrate and seed
command: |
@@ -343,6 +364,9 @@ jobs:
deploy_config_file: deployment_config/staging_vars.yml
new_relic_license: NEW_RELIC_LICENSE_KEY
session_secret: STAGING_SESSION_SECRET
hses_data_file_url: HSES_DATA_FILE_URL
hses_data_username: HSES_DATA_USERNAME
hses_data_password: HSES_DATA_PASSWORD
- run:
name: Run database migrations
command: |
@@ -363,6 +387,9 @@ jobs:
deploy_config_file: deployment_config/prod_vars.yml
new_relic_license: PROD_NEW_RELIC_LICENSE_KEY
session_secret: PROD_SESSION_SECRET
hses_data_file_url: HSES_DATA_FILE_URL
hses_data_username: HSES_DATA_USERNAME
hses_data_password: HSES_DATA_PASSWORD
Review comment (Contributor):

Wish I had caught this earlier. We have one endpoint for dev/sandbox/staging, and another for production, so the prod variables need to be different than what we use for staging.

Suggested change
hses_data_file_url: HSES_DATA_FILE_URL
hses_data_username: HSES_DATA_USERNAME
hses_data_password: HSES_DATA_PASSWORD
hses_data_file_url: PROD_HSES_DATA_FILE_URL
hses_data_username: PROD_HSES_DATA_USERNAME
hses_data_password: PROD_HSES_DATA_PASSWORD

Reply (Collaborator, Author):

Ah, good catch. Cut and paste omission.

- run:
name: Run database migrations
command: |
7 changes: 6 additions & 1 deletion .env.example
@@ -10,7 +10,9 @@ POSTGRES_USERNAME=postgres
POSTGRES_PASSWORD=something_secret
POSTGRES_DB=ttasmarthub
POSTGRES_HOST=localhost
# Add Oauth client id for local development from "Development Credentials" document here.
AUTH_CLIENT_ID=clientId
# Add Oauth client secret for local development from "Development Credentials" document here.
AUTH_CLIENT_SECRET=clientSecret
SESSION_SECRET=secret
TTA_SMART_HUB_URI=http://localhost:3000
@@ -19,11 +21,14 @@ AUTH_BASE=https://uat.hsesinfo.org
REDIRECT_URI_HOST=http://localhost:8080
# CURRENT_USER_ID controls the logged in user when BYPASS_AUTH is set to true.
# This only works in non-production environments
CURRENT_USER_ID=1;
CURRENT_USER_ID=1
# NEW_RELIC_LICENSE_KEY can be omitted in local development
NEW_RELIC_LICENSE_KEY=secret_key
# Set to false to require user to go through auth flow, never true in production envs
BYPASS_AUTH=true
HSES_DATA_FILE_URL=url
HSES_DATA_USERNAME=username
HSES_DATA_PASSWORD=password
# In production, Sequelize instance is created with a postgres URI.
# This URI is automatically dropped into the cloud.gov environment as the env variable DATABASE_URL
DATABASE_URL=secret
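
The three `HSES_DATA_*` variables added here are all needed before the download can work. As a minimal sketch (not part of this PR), a startup check could report which of them are unset. The helper name `missingEnvVars` and the warning text are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of the PR: list required variables that are unset or blank.
const REQUIRED_HSES_VARS = [
  'HSES_DATA_FILE_URL',
  'HSES_DATA_USERNAME',
  'HSES_DATA_PASSWORD',
];

function missingEnvVars(env, required = REQUIRED_HSES_VARS) {
  // A variable counts as missing when it is undefined, empty, or whitespace only.
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

const missing = missingEnvVars(process.env);
if (missing.length > 0) {
  console.warn(`HSES import not fully configured, missing: ${missing.join(', ')}`);
}
```

Passing the environment object as an argument keeps the check easy to exercise in a test without touching `process.env`.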
35 changes: 29 additions & 6 deletions README.md
@@ -9,17 +9,27 @@ For the latest on our product mission, goals, initiatives, and KPIs, see the [Pr
## Getting Started

### Set up

Make sure Docker is installed. To check run `docker ps`

Run `yarn docker:deps`. This builds the frontend and backend docker containers and install dependencies. You only need to run this step the first time you fire up the app and when dependencies are added/updated/removed. Running `yarn docker:start` starts the backend and frontend, browse to `http://localhost:3000` to hit the frontend and `http://localhost:3000/api` to hit the backend. Copying `.env.example` to `.env`, substituting in your user id and group id will cause any files created in docker containers to be owned by your user on your host.

You can also run build commands directly on your host (without docker). Make sure you install dependencies when changing execution method. You could see some odd errors if you install dependencies for docker and then run yarn commands directly on the host, especially if you are developing on windows. If you want to use the host yarn commands be sure to run `yarn deps:local` before any other yarn commands. Likewise if you want to use docker make sure you run `yarn docker:deps`.
#### Docker

1. Make sure Docker is installed. To check run `docker ps`.
2. Make sure you have Node 12.20.0 installed.
3. Run `yarn docker:deps`. This builds the frontend and backend docker containers and installs dependencies. You only need to run this step the first time you fire up the app and when dependencies are added/updated/removed.
4. Copy `.env.example` to `.env`.
6. Change the `AUTH_CLIENT_ID` and `AUTH_CLIENT_SECRET` variables to the values found in the "Values for local development" section of the "Development Credentials" document. If you don't have access to this document, please ask in the hs-vendors-ohs-tta channel of the gsa-tts slack channel.
7. Optionally, set `CURRENT_USER` to your current user's uid:gid. This will cause files created by docker compose to be owned by your user instead of root.
8. Run `yarn docker:db:migrate` to run DB migrations
9. Run `yarn docker:db:seed` to seed the database with test data.
10. Run `yarn docker:start` to start the application. The frontend will be available on `localhost:3000` and the backend will run on `localhost:8080`.
11. Run `yarn docker:stop` to stop the servers and remove the docker containers.

The frontend [proxies requests](https://create-react-app.dev/docs/proxying-api-requests-in-development/) to paths it doesn't recognize to the backend.

Api documentation uses [Redoc](https://github.com/Redocly/redoc) to serve documentation files. These files can be found in the `docs/openapi` folder. Api documentation should be split into separate files when appropriate to prevent huge hard to grasp yaml files.

#### Local build

You can also run build commands directly on your host (without docker). Make sure you install dependencies when changing execution method. You could see some odd errors if you install dependencies for docker and then run yarn commands directly on the host, especially if you are developing on windows. If you want to use the host yarn commands be sure to run `yarn deps:local` before any other yarn commands. Likewise if you want to use docker make sure you run `yarn docker:deps`.

### Running Tests

Run `yarn docker:deps` to install dependencies. Run `yarn docker:db:migrate` and `yarn docker:test` to run all tests for the frontend and backend.
@@ -215,6 +225,18 @@ Our project includes four deployed Postgres databases, one to interact with each

You can run psql commands directly against a deployed database by following these directions.

1. Install Cloud Foundry CLI tool

- On MacOS: `brew install cloudfoundry/tap/cf-cli`
- On other platforms: [Download and install cf][cf-install]

1. Login to cloud.gov account

```bash
cf login -a api.fr.cloud.gov --sso
# follow temporary authorization code prompts
```

1. Install the cloud foundry plugin [cf-service-connect][cf-service-connect]

```bash
@@ -247,6 +269,7 @@ You can run psql commands directly against a deployed database by following thes
[circleci-envvar]: https://app.circleci.com/settings/project/github/adhocteam/Head-Start-TTADP/environment-variables?return-to=https%3A%2F%2Fcircleci.com%2Fdashboard
[cloudgov]: https://dashboard.fr.cloud.gov/home
[cloudgov-deployer]: https://cloud.gov/docs/services/cloud-gov-service-account/
[cf-install]: https://docs.cloudfoundry.org/cf-cli/install-go-cli.html
[cf-service-connect]: https://github.com/cloud-gov/cf-service-connect
[hhs-main]: https://github.com/HHS/Head-Start-TTADP/tree/main
[hhs-prod]: https://github.com/HHS/Head-Start-TTADP/tree/production
Binary file added hses.zip
Binary file not shown.
3 changes: 3 additions & 0 deletions manifest.yml
@@ -16,6 +16,9 @@ applications:
REDIRECT_URI_HOST: ((REDIRECT_URI_HOST))
SESSION_SECRET: ((SESSION_SECRET))
TTA_SMART_HUB_URI: ((TTA_SMART_HUB_URI))
HSES_DATA_FILE_URL: ((HSES_DATA_FILE_URL))
HSES_DATA_USERNAME: ((HSES_DATA_USERNAME))
HSES_DATA_PASSWORD: ((HSES_DATA_PASSWORD))
services:
- ((rds_instance))
- ((s3_doc_upload_bucket))
4 changes: 4 additions & 0 deletions package.json
@@ -132,9 +132,11 @@
"dependencies": {
"@babel/runtime": "^7.12.1",
"@cucumber/cucumber": "^7.0.0-rc.0",
"adm-zip": "^0.5.1",
"axios": "^0.21.1",
"chromedriver": "^87.0.0",
"client-oauth2": "^4.3.3",
"cron": "^1.8.2",
"csv-parse": "^4.14.1",
"cucumber-html-reporter": "^5.2.0",
"dotenv": "^8.2.0",
@@ -146,6 +148,7 @@
"http-codes": "^1.0.0",
"lodash": "^4.17.20",
"memorystore": "^1.6.2",
"mz": "^2.7.0",
"newrelic": "^7.0.1",
"pg": "^8.3.3",
"puppeteer": "^5.3.1",
@@ -154,6 +157,7 @@
"sequelize-cli": "^6.2.0",
"url-join": "^4.0.1",
"winston": "^3.3.3",
"xml2json": "^0.12.0",
"yargs": "^16.1.1"
}
}
21 changes: 21 additions & 0 deletions src/app.js
@@ -7,7 +7,9 @@ import memorystore from 'memorystore';
import path from 'path';
import join from 'url-join';
import { INTERNAL_SERVER_ERROR } from 'http-codes';
import { CronJob } from 'cron';
import { hsesAuth } from './middleware/authMiddleware';
import updateGrantsGrantees from './lib/updateGrantsGrantees';

import findOrCreateUser from './services/accessValidation';

@@ -77,4 +79,23 @@ if (process.env.NODE_ENV === 'production') {
});
}

// Set timing parameters.
// Run at midnight
const schedule = '0 0 * * *';
const timezone = 'America/New_York';

const runJob = () => {
try {
updateGrantsGrantees();
} catch (error) {
logger.error(`Error processing HSES file: ${error}`);
}
};

// Run only on one instance
if (process.env.CF_INSTANCE_INDEX === '0') {
const job = new CronJob(schedule, () => runJob(), null, true, timezone);
job.start();
}

module.exports = app;
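
`updateGrantsGrantees` is declared `async`, so a synchronous `try`/`catch` around an un-awaited call would not see a rejected promise. The function in this PR catches its own errors, so this may never matter in practice. The following is a minimal sketch of a callback that awaits the task; `makeRunJob`, `task`, and `log` are hypothetical names used only for illustration.

```javascript
// Sketch only: `makeRunJob`, `task`, and `log` are hypothetical, not part of the PR.
function makeRunJob(task, log) {
  // Returns an async callback; awaiting the task lets try/catch observe rejections.
  return async function runJob() {
    try {
      await task();
      return true;
    } catch (error) {
      log(`Error processing HSES file: ${error}`);
      return false;
    }
  };
}

// Example with a task that always rejects, logging into an array.
const messages = [];
const failingJob = makeRunJob(
  async () => { throw new Error('download failed'); },
  (msg) => messages.push(msg),
);
```

In `src/app.js` such a callback could presumably be passed as the second argument to `new CronJob(schedule, runJob, null, true, timezone)`. Because the fourth argument already asks the job to start, the separate `job.start()` call looks redundant, though it is likely harmless.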
124 changes: 124 additions & 0 deletions src/lib/updateGrantsGrantees.js
@@ -0,0 +1,124 @@
import AdmZip from 'adm-zip';
import { toJson } from 'xml2json';
import {} from 'dotenv/config';
import axios from 'axios';
import {
Grantee, Grant,
} from '../models';
import logger from '../logger';

const fs = require('mz/fs');
/**
* Reads HSES data files that were previously extracted to the "temp" directory.
* The files received from HSES are:
*
* agency.xml - grantees and grantees that are delegates
* grant_agency.xml - junction between grants and agencies
* grant_award.xml - grants
* grant_award_replacement.xml
* grant_program.xml
*
* The grantee data is then filtered to exclude delegates
*
*/
export async function processFiles() {
let grantGrantees;
let grants;
const granteesForDb = [];
const grantsForDb = [];

try {
const grantAgencyData = await fs.readFile('./temp/grant_agency.xml');
const json = toJson(grantAgencyData);
const grantAgency = JSON.parse(json);
// we are only interested in non-delegates
grantGrantees = grantAgency.grant_agencies.grant_agency.filter(
(g) => g.grant_agency_number === '0',
);

// process grantees aka agencies that are non-delegates
const agencyData = await fs.readFile('./temp/agency.xml');
const agency = JSON.parse(toJson(agencyData));

// filter out delegates by matching to the non-delegates
// eslint-disable-next-line max-len
const granteesNonDelegates = agency.agencies.agency.filter((a) => grantGrantees.some((gg) => gg.agency_id === a.agency_id));

const hubGranteeIds = await Grantee.findAll({ attributes: ['id'] }).map((hgi) => hgi.id);

// process grants
const grantData = await fs.readFile('./temp/grant_award.xml');
const grant = JSON.parse(toJson(grantData));

// Check if the grantee id already exists in the smarthub db OR if it belongs to
// at least one active grant. grant_award data structure includes agency_id
// eslint-disable-next-line max-len
const grantees = granteesNonDelegates.filter((gnd) => hubGranteeIds.some((id) => id.toString() === gnd.agency_id)
|| grant.grant_awards.grant_award.some((ga) => ga.agency_id === gnd.agency_id && ga.grant_status === 'Active'));

grantees.forEach((g) => granteesForDb.push({
id: parseInt(g.agency_id, 10),
name: g.agency_name,
}));

await Grantee.bulkCreate(granteesForDb,
{
updateOnDuplicate: ['name', 'updatedAt'],
});

const hubGrantIds = await Grant.findAll({ attributes: ['id'] }).map((hgi) => hgi.id);

grants = grant.grant_awards.grant_award.filter((ga) => hubGrantIds.some((id) => id.toString() === ga.grant_award_id) || ga.grant_status === 'Active');

grants.forEach((g) => grantsForDb.push({
id: parseInt(g.grant_award_id, 10),
number: g.grant_number,
regionId: parseInt(g.region_id, 10),
granteeId: parseInt(g.agency_id, 10),
status: g.grant_status,
startDate: g.grant_start_date,
endDate: g.grant_end_date,
}));

await Grant.bulkCreate(grantsForDb,
{
updateOnDuplicate: ['number', 'regionId', 'granteeId', 'status', 'startDate', 'endDate', 'updatedAt'],
});
} catch (error) {
logger.error(`Error reading or updating database on HSES data import: ${error.message}`);
throw error;
}
}
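
The grantee selection in `processFiles` nests `some` inside `filter`, which rescans the other lists for every agency. As a rough sketch under stated assumptions, the same rule (non-delegate, and either already in the Hub or holding at least one active grant) can be expressed with `Set` lookups. `selectGrantees` and the sample records are hypothetical; only the field names come from the diff.

```javascript
// Sketch with hypothetical sample data; field names follow the XML-derived objects in the diff.
function selectGrantees(agencies, grantAgencies, grantAwards, hubGranteeIds) {
  // Agencies linked to a grant with grant_agency_number '0' are treated as non-delegates.
  const nonDelegateIds = new Set(
    grantAgencies
      .filter((ga) => ga.grant_agency_number === '0')
      .map((ga) => ga.agency_id),
  );
  // Ids already stored in the Hub are numbers; the XML ids are strings.
  const knownIds = new Set(hubGranteeIds.map((id) => id.toString()));
  const activeIds = new Set(
    grantAwards
      .filter((ga) => ga.grant_status === 'Active')
      .map((ga) => ga.agency_id),
  );
  return agencies
    .filter((a) => nonDelegateIds.has(a.agency_id))
    .filter((a) => knownIds.has(a.agency_id) || activeIds.has(a.agency_id))
    .map((a) => ({ id: parseInt(a.agency_id, 10), name: a.agency_name }));
}

const sample = selectGrantees(
  [
    { agency_id: '1', agency_name: 'Agency One' },
    { agency_id: '2', agency_name: 'Delegate Two' },
    { agency_id: '3', agency_name: 'Inactive Three' },
    { agency_id: '4', agency_name: 'Known Four' },
  ],
  [
    { agency_id: '1', grant_agency_number: '0' },
    { agency_id: '2', grant_agency_number: '1' },
    { agency_id: '3', grant_agency_number: '0' },
    { agency_id: '4', grant_agency_number: '0' },
  ],
  [
    { agency_id: '1', grant_status: 'Active' },
    { agency_id: '2', grant_status: 'Active' },
    { agency_id: '3', grant_status: 'Inactive' },
    { agency_id: '4', grant_status: 'Inactive' },
  ],
  [4],
);
console.log(sample);
```

Keeping the selection in a pure function like this would also make it testable without reading the XML files or touching the database.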

// reading archives
const zip = new AdmZip('./hses.zip');

/**
* Downloads the HSES grantee/grant zip, extracts to the "temp" directory
* and calls processFiles to parse xml data and populate the Smart Hub db
*
* Note - file download needs to happen in deployed environments
*/
export default async function updateGrantsGrantees() {
try {
if (process.env.NODE_ENV === 'production') {
const response = await axios(process.env.HSES_DATA_FILE_URL, {
method: 'get',
url: process.env.HSES_DATA_FILE_URL,
responseType: 'stream',
auth: {
username: process.env.HSES_DATA_USERNAME,
password: process.env.HSES_DATA_PASSWORD,
},
});

await response.data.pipe(fs.createWriteStream('hses.zip'));
}
// extract to target path. Pass true to overwrite
zip.extractAllTo('./temp', true);

await processFiles();
} catch (error) {
logger.error(error);
}
}
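
One thing to note about the download step: `pipe()` returns the destination stream rather than a promise, so `await`ing it does not wait for the file to be fully written. Below is a minimal sketch of one alternative, assuming Node's built-in `stream.pipeline` wrapped with `util.promisify`. `saveStream` is a hypothetical helper, and the in-memory stream stands in for `response.data`.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pipeline, Readable } = require('stream');
const { promisify } = require('util');

const pipelineAsync = promisify(pipeline);

// Hypothetical helper: resolves only after `source` has been fully written to `destPath`.
async function saveStream(source, destPath) {
  await pipelineAsync(source, fs.createWriteStream(destPath));
  return destPath;
}

// Stand-in for `response.data` from an axios call made with responseType: 'stream'.
const fakeDownload = Readable.from([Buffer.from('zip '), Buffer.from('bytes')]);
const target = path.join(os.tmpdir(), `hses-sketch-${process.pid}.zip`);

saveStream(fakeDownload, target).then((written) => {
  console.log(`${fs.statSync(written).size} bytes written to ${written}`);
  fs.unlinkSync(written); // clean up the demo file
});
```

With this approach, extraction would start only after the promise resolves. It may also be worth creating the `AdmZip` instance at that point rather than at module load, so that it reads the freshly downloaded archive.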
1 change: 1 addition & 0 deletions src/migrations/20201029214432-add-title-to-user.js
@@ -23,6 +23,7 @@ module.exports = {
'homeRegionId',
{
type: Sequelize.INTEGER,
allowNull: true,
references: {
model: 'Regions',
key: 'id',