Acouch patch 1 #2174

Closed · wants to merge 60 commits

Conversation

@acouch (Collaborator) commented Sep 18, 2024

Summary

Fixes #{ISSUE}

Time to review: x mins

Changes proposed

What was added, updated, or removed in this PR.

Context for reviewers

Testing instructions, background context, more in-depth details of the implementation, and anything else you'd like to call out or ask reviewers. Explain how the changes were verified.

Additional information

Screenshots, GIF demos, code examples or output to help show the changes working as expected.

acouch and others added 30 commits May 14, 2024 15:39
## Summary
Fixes HHS#1757 

## Changes proposed
- Add e2e tests for search page
- Add e2e test (with FE and API) to its own CI job
(`ci-frontend-ci.yml`)
    - invokes shell script to wait until API is loaded
## Summary
Fixes #37 

## Changes proposed
- add some of the relevant tests from bug bash
## Summary
Fixes HHS#1957

## Changes proposed
- Update sortby labels and ordering
### Time to review: __1 min__

## Changes proposed
Needed to upgrade dependencies for the API for grype issue:
https://github.com/navapbc/simpler-grants-gov/actions/runs/9180615894/job/25245519194?pr=47

## Additional information
As usual, just ran `poetry update`
## Summary
Fixes #9

### Time to review: __10 mins__

## Changes proposed
Set up a search index to run locally via Docker

Updated makefile to automatically initialize the index + added a script
to wait for the index to start up before proceeding.

Set up a very basic client for connecting to the search index (will be
expanded more in subsequent PRs)

Basic test / test utils to verify it is working (also will be expanded)

## Context for reviewers
This is the first step in getting the search index working locally. This
actually gets it running and the client works; we just aren't doing
anything meaningful with it yet besides tests.

## Additional information

This doesn't yet create an index that we can use, except in the test.
However, if you want to test out a search index, you can go to
http://localhost:5601/app/dev_tools#/console (after running `make init`)
to run some queries against the (one node) cluster.
https://opensearch.org/docs/latest/getting-started/communicate/#sending-requests-in-dev-tools
provides some examples of how to create + use indexes that you can
follow.
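
If you'd rather poke at the cluster from Python instead of the dev tools console, a minimal client sketch might look like the following (host, port, and SSL settings are assumptions based on a default local setup, not the project's actual configuration):

```python
# Hedged sketch of a minimal local OpenSearch client; connection details are
# assumptions for a default single-node docker setup, not this repo's config.
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],
    use_ssl=False,
)

# Sanity check against the one-node cluster.
print(client.cluster.health())
```
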
… the index (#44)

## Summary
Fixes #12

### Time to review: __5 mins__

## Changes proposed
Made a new set of v1 endpoints that are basically copy-pastes of the
v0.1 opportunity endpoints

## Context for reviewers
Some changes I want to make to the schemas wouldn't make sense without
the search index (eg. adding the filter counts to the response). As we
have no idea what the actual launch of the v0.1 endpoint is going to
look like, I don't want to mess with any of that code or try to make a
weird hacky approach that needs to account for both the DB
implementation and the search index one.

Also, I think we've heard that with the launch of the search index,
we'll be "officially" launched, so might as well call in v1 at the same
time.

Other than adjusting the names of a few schemas in v0.1, I left that
implementation alone and just copied the boilerplate that I'll fill out
in subsequent tickets.

## Additional information
The endpoint appears locally:
![Screenshot 2024-05-20 at 12 18 32
PM](https://github.com/navapbc/simpler-grants-gov/assets/46358556/86231ec1-417a-41c6-ad88-3d06bb6214e5)

---------

Co-authored-by: nava-platform-bot <[email protected]>
## Summary
Fixes #10

### Time to review: __10 mins__

## Changes proposed
Set up a script to populate the search index by loading opportunities
from the DB, jsonify'ing them, loading them into a new index, and then
aliasing that index.

Several utilities were created to simplify working with the
OpenSearch client (a wrapper for setting up configuration / patterns)

## Context for reviewers
Iterating over the opportunities and doing something with them is a
common pattern in several of our scripts, so nothing is really different
there.

The meaningful implementation is how we handle creating and aliasing the
index. In OpenSearch you can give any index an alias (including putting
multiple indexes behind the same alias). The approach is pretty simple:
* Create an index
* Load opportunities into the index
* Atomically swap the index backing the `opportunity-index-alias`
* Delete the old indexes if they exist

This approach means that our search endpoint just needs to query the
alias, and we can keep making new indexes and swapping them out behind
the scenes. Because we could remake the index every few minutes, if we
ever need to re-configure things like the number of shards, or any other
index-creation configuration, we just update that in this script and
wait for it to run again.
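
As a rough illustration of that create, load, swap, and delete flow, here is a hedged sketch using opensearch-py directly; the index naming scheme and record shape are assumptions, not this script's actual code:

```python
# Hedged sketch of the create -> load -> alias-swap -> cleanup flow described
# above. "opportunity-index-alias" comes from the PR; everything else is
# illustrative.
from datetime import datetime, timezone

from opensearchpy import OpenSearch, helpers

ALIAS = "opportunity-index-alias"

def reindex_opportunities(client: OpenSearch, records: list[dict]) -> None:
    # 1. Create a brand-new index with a timestamped name.
    new_index = f"opportunity-index-{datetime.now(timezone.utc):%Y%m%d%H%M%S}"
    client.indices.create(index=new_index)

    # 2. Bulk-load the jsonify'd opportunity records into it.
    helpers.bulk(
        client,
        ({"_index": new_index, "_id": r["opportunity_id"], "_source": r} for r in records),
    )

    # 3. Atomically swap the alias over to the new index.
    old_indexes = (
        list(client.indices.get_alias(name=ALIAS))
        if client.indices.exists_alias(name=ALIAS)
        else []
    )
    alias_actions = [{"add": {"index": new_index, "alias": ALIAS}}]
    alias_actions += [{"remove": {"index": idx, "alias": ALIAS}} for idx in old_indexes]
    client.indices.update_aliases(body={"actions": alias_actions})

    # 4. Delete the old indexes now that nothing points at them.
    for idx in old_indexes:
        client.indices.delete(index=idx)
```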

## Additional information
I ran this locally after loading `83250` records, and it took about 61s.

You can run this locally yourself by doing:
```sh
make init
make db-seed-local
poetry run flask load-search-data load-opportunity-data
```

If you'd like to see the data, you can test it out on
http://localhost:5601/app/dev_tools#/console - here is an example query
that filters by the word `research` across a few fields and filters to
just forecasted/posted.

```json
GET opportunity-index-alias/_search
{
  "size": 25,
  "from": 0,
  "query": {
    "bool": {
      "must": [
        {
          "simple_query_string": {
            "query": "research",
            "default_operator": "AND", 
            "fields": ["agency.keyword^16", "opportunity_title^2", "opportunity_number^12", "summary.summary_description", "opportunity_assistance_listings.assistance_listing_number^10", "opportunity_assistance_listings.program_title^4"]
          }
        }
      ],
      "filter": [
        {
          "terms": {
            "opportunity_status": [
              "forecasted",
              "posted"
            ]
          }
        }
      ]
    }
  }
}

```
## Summary
Fixes #6 

### Time to review: __60 mins__

## Changes proposed

### Move pages from page to app router:

1. Move all pages to
[`[locale]`](https://next-intl-docs.vercel.app/docs/getting-started/app-router/with-i18n-routing#getting-started)
folder
2. Add
[`generateMetadata()`](https://nextjs.org/docs/app/api-reference/functions/generate-metadata#generatemetadata-function)
function and [next-intl
`getTranslations()`](https://next-intl-docs.vercel.app/docs/environments/metadata-route-handlers#metadata-api)
implementation
* @rylew1 commented we could remove this from each page. To do that we
could use [prop
arguments](https://nextjs.org/docs/app/api-reference/functions/generate-metadata#with-segment-props)
and update the metadata based on the param. There is also more we can do with the
metadata to properly add [app links and twitter
cards](https://nextjs.org/docs/app/api-reference/functions/generate-metadata#applinks).
TODO: create ticket
3. Replace i18n's `useTranslation` with next-intl's `useTranslations`
4. Remove hard-coded strings that were present b/c we were still between
i18next and next-intl

#### Changes
* [Move process page to
app](32ba4ee)

* [Move research page to
app](5b5ad1a)

* [Move health page to
app](a3e6255)

* [Move feature flag page to
app](395baed)
* [Move search page to app
router](1e261e3)
* [Move newsletter pages to app
router](b509ef8)
* [Move home page to app
router](de1be98)
* [Move home page to app
router](74077ae)
* [Move 404 page to app
router](ccbc956)

### Misc

1. [Delete hello
api](5bad6ea)
     * This was left from the project creation
2. [Add USWDS icon
component](0120c7b)
* as noted in a slack discussion, when trying to access [one of the
icons](https://github.com/trussworks/react-uswds/blob/main/src/components/Icon/Icons.ts)
using `<Icon.Search/>` next errors: `You cannot dot into a client module
from a server component. You can only pass the imported name through`.
I'm not sure why it thinks the Icon component is a client module. [Dan
A.
suggests](vercel/next.js#51593 (comment))
trussworks should re-export as named exports. I tried importing the SVGs
directly from the trussworks library but [svgr requires a custom webpack
config](https://react-svgr.com/docs/next/) which is a road I didn't want
to go down and [react svg](https://www.npmjs.com/package/react-svg)
threw an error in the app router 😥 .
* I implemented @sawyerh 's
[suggestion](0120c7b#diff-dadb35bd2f3f61f2c179f033cd0a2874fc343974236f2fb8613664703c751429),
which did not work initially b/c next reported the USWDS icon was
corrupt, which was fixed by adding a `viewBox` to the svg element 😮‍💨 .
* [Remove unused
WtGIContent](75490f7)

### Layout and component updates
* [Move layout and update for app
router](af112fd)

* [Update global components for the app
router](40119e6)

### Remaining next-intl config and removal of i18next

* [Move i18n strings for app
router](eb3c07c)

* [Adds next-intl config and removes
i18n](c546571)

* [Update tests for app
router](3b9b193)

* [Removes i18next and next-i18n
packages](9d2e08a)

* [Update storybook settings for app
router](39f115d)
## Summary
Fixes #37 


## Changes proposed
- add e2e docs prereq api command
## Summary
Fixes #50

## Changes proposed
- switch to using USWDS `Select` component
## Summary
Fixes #56 

## Changes proposed
- Update the date creation to the parsed year/month/day so it creates
the `Date` object using the local time - this may prevent any type of
rounding
- Add some tests to ensure the right dates are showing up on search
…freshes (#67)

## Summary
Fixes #58

### Time to review: __3 mins__

## Changes proposed
Set the `persistAuthorization` OpenAPI config locally to True

## Context for reviewers
For local development, we frequently need to go to
http://localhost:8080/docs - enter the auth token, and then repeat this
process every time we reopen this page or refresh. Having to either
copy-paste or retype the auth token is tedious. This flag makes it so it
gets preserved in your browser's local storage.

We are only enabling this for the local endpoint at the moment as there
are possibly security implications we would need to consider non-locally
(eg. what if someone is using a public computer).
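
For reference, a hedged sketch of what enabling this could look like, assuming APIFlask's `SWAGGER_UI_CONFIG` passthrough to Swagger UI and an `ENVIRONMENT` variable for detecting local runs (both assumptions here, not this repo's actual config plumbing):

```python
# Hedged sketch: persistAuthorization is the Swagger UI option that keeps the
# entered auth token in the browser's local storage. The ENVIRONMENT check is
# an assumption about how local-only configuration might be gated.
import os

from apiflask import APIFlask

app = APIFlask(__name__)

if os.environ.get("ENVIRONMENT") == "local":
    app.config["SWAGGER_UI_CONFIG"] = {"persistAuthorization": True}
```
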
## Summary
Fixes #51 

## Changes proposed
- Add 300ms debounce on pagination
## Summary
Fixes #64 

## Changes proposed
- update to next 14.2.3 (https://nextjs.org/blog/next-14-2)
## Summary
Fixes #69

### Time to review: __5 mins__

## Changes proposed
Remove the `BASE_RESPONSE_SCHEMA` configuration.

Adjust the healthcheck endpoint to be consistent in defining the schema
/ responses with the other endpoints.

## Context for reviewers
APIFlask allows you to set a `BASE_RESPONSE_SCHEMA` - the idea is that
it's the shared schema that every API endpoint will return, and they will
only differ in the `data` object within. This sounds great on paper as
it prevents you from needing to define most of the response schema for
many endpoints, but in practice, it's clunky to use. If we want to modify
anything in the response schema outside of the `data` object, it will
affect every single endpoint. This means when we add something like the
pagination info to our search endpoint, a pagination object appears on
the healthcheck response. As we intend to make these docs something the
public will use, this is going to be confusing.

There is also a "bug" (unsure if it is as it was an intended change a
few months ago in APIFlask/apispec) that the error response objects end
up nested within themselves in the examples.

For example, currently the error response for the healthcheck endpoint
(which can literally only return a 5xx error) has an example response
of:
```json
{
  "data": {
    "data": "string",
    "errors": [
      {
        "field": "string",
        "message": "string",
        "type": "string"
      }
    ],
    "message": "string",
    "status_code": 0
  },
  "message": "string",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 0,
  "warnings": [
    {
      "field": "string",
      "message": "string",
      "type": "string"
    }
  ]
}
```
In reality, the error response it actually returns looks like:
```json
{
  "data": {},
  "errors": [],
  "message": "Service Unavailable",
  "status_code": 503
}
```

This set of changes works around all of these confusing issues and just
requires us to define specific response schemas for each endpoint with
some small set of details. I've kept some base schema classes to derive
from that we can use in most cases.
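
As a rough illustration of that per-endpoint approach with a shared base class, here is a hedged sketch; the schema and field names are illustrative, not the ones in this PR:

```python
# Hedged sketch of "specific response schemas per endpoint, derived from a
# small base class" as described above.
from apiflask import APIFlask, Schema
from apiflask.fields import Integer, String

app = APIFlask(__name__)

class BaseResponseSchema(Schema):
    message = String(metadata={"description": "Human-readable status message"})
    status_code = Integer(metadata={"description": "HTTP status code"})

class HealthcheckResponseSchema(BaseResponseSchema):
    # The healthcheck has no real payload, so it only adds a nullable data field.
    data = String(allow_none=True, dump_default=None)

@app.get("/health")
@app.output(HealthcheckResponseSchema)
def health():
    return {"message": "Service healthy", "status_code": 200, "data": None}
```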

## Additional information

Before & After Examples in OpenAPI

<table>
<tr><th>Endpoint</th><th>Before</th><th>After</th><th>Actual Response
(before change)</th></tr>
<tr><td>GET /health (200) </td>
<td>

```json
{
  "data": {
    "message": "string"
  },
  "message": "string",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 0,
  "warnings": [
    {
      "field": "string",
      "message": "string",
      "type": "string"
    }
  ]
}
```
</td>
<td>

```json
{
  "data": null,
  "message": "Success",
  "status_code": 200
}
```
</td>
<td>

```json
{
  "data": {},
  "message": "Service healthy",
  "pagination_info": null,
  "status_code": 200,
  "warnings": []
}
```
</td>
</tr>

<tr><td>GET /health (503)</td>
<td>

```json
{
  "data": {
    "data": "string",
    "errors": [
      {
        "field": "string",
        "message": "string",
        "type": "string"
      }
    ],
    "message": "string",
    "status_code": 0
  },
  "message": "string",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 0,
  "warnings": [
    {
      "field": "string",
      "message": "string",
      "type": "string"
    }
  ]
}
```
</td>
<td>

```json
{
  "data": {},
  "errors": [],
  "message": "Error",
  "status_code": 0
}
```
</td>
<td>

```json
{
  "data": {},
  "message": "Service unavailable",
  "pagination_info": null,
  "status_code": 200,
  "warnings": []
}
```
</td>
</tr>

<tr><td>POST /v0.1/opportunities/search (200)</td>
<td>

```json
{
  "data": [
    {.. Excluding for brevity }
  ],
  "message": "string",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 0,
  "warnings": [
    {
      "field": "string",
      "message": "string",
      "type": "string"
    }
  ]
}
```
</td>
<td>

```json
{
  "data": [
    {.. Excluding for brevity }
  ],
  "message": "Success",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 200
}
```
</td>
<td>

```json
{
  "data": [
    {}, {}, {} // excluded for brevity
  ],
  "message": "Success",
  "pagination_info": {
    "order_by": "opportunity_id",
    "page_offset": 1,
    "page_size": 3,
    "sort_direction": "ascending",
    "total_pages": 1010,
    "total_records": 3030
  },
  "status_code": 200,
  "warnings": []
}
```
</td>
</tr>

<tr><td>POST /v0.1/opportunities/search (401 or 422)</td>
<td>

```json
{
  "data": {
    "data": "string",
    "errors": [
      {
        "field": "string",
        "message": "string",
        "type": "string"
      }
    ],
    "message": "string",
    "status_code": 0
  },
  "message": "string",
  "pagination_info": {
    "order_by": "id",
    "page_offset": 1,
    "page_size": 25,
    "sort_direction": "ascending",
    "total_pages": 2,
    "total_records": 42
  },
  "status_code": 0,
  "warnings": [
    {
      "field": "string",
      "message": "string",
      "type": "string"
    }
  ]
}
```
</td>
<td>

```json
{
  "data": {},
  "errors": [],
  "message": "Error",
  "status_code": 0
}
```
</td>
<td>

```json
{
  "data": {},
  "errors": [],
  "message": "The server could not verify that you are authorized to access the URL requested",
  "status_code": 401
}
```
</td>
</tr>

</table>

---------

Co-authored-by: nava-platform-bot <[email protected]>
## Summary
Fixes #45 

### Time to review: __3 mins__

## Changes proposed
* update `config.py` database url
* add function in `cli.py` 
* updated packages in `poetry.lock`

## Context for reviewers
N/A

## Additional information
Row created manually in the database alongside a row created via
`test_connection`
![Screen Shot 2024-06-11 at 1 49 53
PM](https://github.com/navapbc/simpler-grants-gov/assets/37313082/b83afad8-5fe1-404f-adf3-c94945740bbe)
## Summary
Fixes #40 

### Time to review: 5 min

## Changes proposed
- Add `pa11y-ci` to run PR checks
   - Tests for each of the pages we have so far
## Summary
Fixes #88

### Time to review: __5 mins__

## Changes proposed
Remove the version field from the docker-compose files

Adjust the docker commands to use `docker compose` instead of
`docker-compose`

## Context for reviewers
A recent version of docker removed the need for specifying a version in
the compose files - so the field is now obsolete and just gives a
warning whenever you run a command:
https://docs.docker.com/compose/compose-file/04-version-and-name/

The change in the command we run is actually from 2021 and makes sure we
use docker compose v2. Depending on how you've set up your local instance
of docker, `docker-compose` may have been aliased to `docker compose`
anyways (I actually think if it wasn't, certain things break anyways).
This just makes sure we're using the proper format.

## Additional information
I went through running several docker commands and noticed no
difference; since this change shouldn't meaningfully change anything, that
is to be expected.
## Summary
Fixes #73 

### Time to review: 5 min

## Changes proposed
- Generate dynamic sitemap with `/app/sitemap.ts` (next convention)
- Split pa11y config into `pa11y-desktop` and `pa11y-mobile`
## Summary
Fixes #96

## Changes proposed
- Add new id-based opportunity page
- Add new `OpportunityListingAPI` class extended from `BaseAPI`
- Make `searchInputs`/`QueryParamData` in `BaseAPI` and `errors.ts`
optional params (only used for search page)
- Update sitemap to replace [id] in url with 1
- Add test coverage
## Summary
Fixes #{74}

### Time to review: __3 mins__

## Changes proposed
> Added API setup to PA11y
> Updated Pa11y runtime to enable FF for search

## Context for reviewers
> FF requirement was discovered with Ryan while exploring feature. Will
enable the search page for proper testing
> • svg elements with an img role have an alternative text
(https://dequeuniversity.com/rules/axe/4.2/svg-img-alt?application=axeAPI)

   (html)

   <html lang="en"><head><meta charset="utf-8"><me...</html>

• svg elements with an img role have an alternative text
(https://dequeuniversity.com/rules/axe/4.2/svg-img-alt?application=axeAPI)

   (html)

   <html lang="en"><head><meta charset="utf-8"><me...</html>
## Additional information
>Two new errors found now that API results are loading.

![desktop-main-view](https://github.com/navapbc/simpler-grants-gov/assets/29316916/f13859bd-87a7-466d-a20b-e86a9dbe71e5)
…rom search (#54)

## Summary
Fixes #14

### Time to review: __10 mins__

## Changes proposed
Created utilities for creating requests to opensearch, and parsing the
responses into more manageable objects.

Added some logic for configuring how we create indexes to use a
different tokenizer + stemmer as the defaults aren't great.

## Context for reviewers
The search queries we need to create aren't that complex, but they're
pretty large and very nested objects. To help with this, I've built a
few generic utilities for creating the requests by using a builder to
pass in each of the components of the search request. This way when the
API gets built out next, the search logic really is just taking our
requests and passing the details to the factories, which is pretty
trivial.

Responses are a bit less complex, just very nested, and adding a simple
wrapper around them means any usage of the search client needs to do a
bit less indexing into dictionaries (eg. to access
the response objects I was doing `values = [record["_source"] for record
in response["hits"]["hits"]]` which is fun). That logic just safely
handles parsing the responses in a very generic manner.

Note that in both cases there are many things we don't handle yet (a ton
of other request params, for example); we can add those as needed. I just
focused on the ones we need right now.
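
To make the builder idea concrete, here is a hedged sketch of what such a builder could look like; it is illustrative only, and the actual class names and interface in the PR may differ:

```python
# Hedged sketch: accumulate pieces of an OpenSearch request and emit the
# nested dict at the end, so callers never hand-build the bool query.
class SearchQueryBuilder:
    def __init__(self) -> None:
        self._size = 25
        self._offset = 0
        self._must: list[dict] = []
        self._filters: list[dict] = []

    def pagination(self, size: int, offset: int) -> "SearchQueryBuilder":
        self._size, self._offset = size, offset
        return self

    def simple_query(self, query: str, fields: list[str]) -> "SearchQueryBuilder":
        self._must.append(
            {"simple_query_string": {"query": query, "fields": fields, "default_operator": "AND"}}
        )
        return self

    def filter_terms(self, field: str, values: list[str]) -> "SearchQueryBuilder":
        self._filters.append({"terms": {field: values}})
        return self

    def build(self) -> dict:
        return {
            "size": self._size,
            "from": self._offset,
            "query": {"bool": {"must": self._must, "filter": self._filters}},
        }

# Usage example, mirroring the query shown earlier in this PR description.
request = (
    SearchQueryBuilder()
    .pagination(size=25, offset=0)
    .simple_query("research", ["opportunity_title^2", "opportunity_number^12"])
    .filter_terms("opportunity_status", ["forecasted", "posted"])
    .build()
)
```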

---

One unrelated change made it in here and that was adjusting how the
analysis was done on an index. In short, the default
tokenization/stemming of words was clunky for our use case. For example,
the default stemmer treats `king` and `kings` as separate words. I
adjusted the stemmer to use the [snowball
stemmer](https://snowballstem.org/) which seems to work a bit better
although we should definitely investigate this further. I also changed
the tokenization to be on whitespaces as before it would separate on
dashes, and a lot of terms in our system have dashes (opportunity number
and agency pretty prominently).
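
A hedged sketch of what that analysis configuration might look like at index-creation time (the analyzer and filter names are illustrative, not the PR's actual settings):

```python
# Hedged sketch: whitespace tokenization plus snowball stemming, applied when
# creating an index. Pass this as the body of an indices.create call.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "custom_text_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",  # split on whitespace, keep dashed terms intact
                    "filter": ["lowercase", "custom_snowball"],
                }
            },
            "filter": {
                "custom_snowball": {"type": "snowball", "language": "English"}
            },
        }
    }
}

# client.indices.create(index="opportunity-index-...", body=index_settings)
```
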

## Additional information
Since this logic is potentially going to be shared across many
components (if we ever build out more search indexes) - I tried to
document it pretty thoroughly with links to a lot of documentation.
## Summary
Fixes #24

### Time to review: __10 mins__

## Changes proposed
Adds an endpoint to fetch opportunity versions
* Only includes some of the filters that we'll need to include

Adds a lot of utilities for setting up opportunities for local
development and testing with versions

## Context for reviewers

https://docs.google.com/document/d/18oWmjQJKunMKy6cfnfUnyGEX33uu5UDPnRktD_wRFlE/edit#heading=h.4xmkylqq7mnx
provides a lot of context for how opportunity versioning works in the
existing system - which is to say it's very, very complex. I'm sure we'll
alter that behavior as we go forward, for now I kept the endpoint simple
in terms of its filters, just removing obvious cases (eg. the summary
record is marked as deleted).

I'm also not certain what we want to do with naming. I really don't like
my current approach of "forecast" and "non-forecast", but we can address
that later as well.

--
Beyond understanding what versioning logic we needed to support, the
most complex component by far is setting up the data in the first place
in an easy manner. I originally tried some ideas using the factory
classes themselves, but due to the order of operations necessary to do
that, that wasn't possible (in short, to create prior history records, I
first need the current record, but that doesn't exist until after
everything else in a factory runs). So, I went with a builder process
that wraps the factories and sets up some reasonable scenarios for you.
It's clunky, but it seems to work well enough.

---------

Co-authored-by: nava-platform-bot <[email protected]>
### Time to review: __1 min__

## Changes proposed
Update packages in poetry.lock

## Context for reviewers
Working around a vulnerability
https://github.com/navapbc/simpler-grants-gov/actions/runs/9664332757/job/26659684726
which is because we have a fairly old version of a package in our lock
file.

I skimmed through the package updates; they all seem to be very minor
version updates, as most other packages were fairly up-to-date.

---------

Co-authored-by: nava-platform-bot <[email protected]>
…rint and issue data to database (#84)

## Summary
Fixes #46 

### Time to review: __x mins__

## Changes proposed
* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`
## Context for reviewers

> One strategy would be to keep the make sprint-data-export and
issue-data-export and create make sprint-db-data-import and
issue-data-db-import so that the data is exported to JSON and then
imported into the database.
> 
> A single make command could then be created to run the export and
then import files.

## Additional information
Sample data in database

<img width="1133" alt="Screen Shot 2024-06-26 at 3 38 47 PM"
src="https://github.com/navapbc/simpler-grants-gov/assets/37313082/34c962d6-a78e-4963-be15-ef0f7de3bccf">
## Summary
Fixes #16

### Time to review: __10 mins__

## Changes proposed
Make the v1 search opportunity endpoint connect to the search index and
return results.

Adjust the structure of the response to be more flexible going forward.

## Context for reviewers
The actual building of the search request / parsing the response is
pretty simple. Other than having to map some field names, that logic is
mostly contained in the builder I made in the prior PR. However, there
is a lot of configuration and other API components that had to be
modified as part of this including:
* Adjusting the API response schema (to better support facet counts)
* Piping through the search client + index alias name configuration.
* A monumental number of test cases to verify everything is connected /
behavior works in a way we expect - note that I did not test relevancy
as that'll break anytime we adjust something.

## Additional information

Note that the change in API schema means the API does not work with the
frontend, but there are a few hacky changes you can make to connect
them. In
[BaseApi.ts](https://github.com/navapbc/simpler-grants-gov/blob/main/frontend/src/app/api/BaseApi.ts#L47)
change the version to `v1`. In
[SearchOpportunityAPI.ts](https://github.com/navapbc/simpler-grants-gov/blob/main/frontend/src/app/api/SearchOpportunityAPI.ts#L56)
add `response.data = response.data.opportunities;` to the end of the
`searchOpportunities` method.

With that, the local frontend will work.

To actually get everything running locally, you can run:
```sh
make db-recreate
make init
make db-seed-local args="--iterations 10"
poetry run flask load-search-data load-opportunity-data
make run-logs
# and also to run the frontend
npm run dev
```
Then go to http://localhost:3000/search

---------

Co-authored-by: nava-platform-bot <[email protected]>
chouinar and others added 11 commits September 9, 2024 15:27
## Summary
Fixes #191

### Time to review: __5 mins__

## Changes proposed
Adjusted a few makefile commands / added some additional docs

Renamed `db-recreate` to `volume-recreate` as the command also recreates
the OpenSearch volume.

Added a command for populating your OpenSearch index locally for
opportunities rather than needing to build the full command yourself.

## Context for reviewers
As we've added more commands to the Makefile, we haven't kept on top of
how they all interact. This should add some level of clarity to the
commands.

The new `make populate-search-opportunities` command is a much easier
way of writing out the full command of `docker compose run --rm
grants-api poetry run flask load-search-data load-opportunity-data`.

## Additional information
Note that we have a `make help` command which prints out any command
with a `## help text` on the target definition in the makefile:
<img width="1353" alt="Screenshot 2024-09-06 at 4 38 02 PM"
src="https://github.com/user-attachments/assets/d21f25c4-031f-4028-bf6b-d73908215b01">
## Summary
Fixes #164

### Time to review: __10 mins__

## Changes proposed
Adds support to our search query builder layer to handle building
queries for filtering on ints and dates between specific date ranges.

## Context for reviewers
Added ints and dates in the same ticket as the queries are essentially
the same, just ints vs dates. While it throws an exception if both start
and end values are None, I think the API request schema should also do
that so that the error is more relevant/accurate to the API, but I can
add that later (likely a lot more niche edge cases to handle for
requests).

Nothing too exciting with these queries, they work as expected and are
just simple ranges.

## Additional information
The test dataset is roughly accurate (turns out books didn't always have
exact release dates until the last ~20 years).

I also have tested these queries manually with the opportunity endpoints
fields, the following two queries would work (can be tested locally at
http://localhost:5601/app/dev_tools#/console ):
```
GET opportunity-index-alias/_search
{
  "size": 5,
  "query": {
    "bool": {
      "filter": [
          {
            "range": {
              "summary.post_date": {
                "gte": "2020-01-01",
                "lte": "2025-01-01"
              }
            }
          }
        ]
    }
  }
}

GET opportunity-index-alias/_search
{
  "size": 12,
  "query": {
    "bool": {
      "filter": [
          {
            "range": {
              "summary.award_floor": {
                "gte": 1234,
                "lte": 100000
              }
            }
          }
        ]
    }
  }
}
```
## Summary
Fixes #186 

### Time to review: 2 mins

## Changes proposed
> Cleaning up docs around getting onboarded to the project/repo

## Context for reviewers
> I walked through the READMEs and linked documentation to try to get
things up and running on my new laptop without having to go to any
outside sources of information (including you all), adding anything that
was missing to the docs.
## Summary
Fixes #125

### Time to review: __10 mins__

## Changes proposed
Added agency tables

Added tgroups tables to the staging & foreign tables

Added factories / data setup for these tables

## Context for reviewers

https://docs.google.com/document/d/1EPZJyqTQruq-BkQoojtrkLqgVl8GrpfsZAIY9T1dEg8/edit
provides a lot of rough details on how this data works in the existing
system

The next ticket will be to transform the data stored in tgroups into the
new agency table(s).

The agency table I created contains most of the fields from the legacy
system (in a normalized structure) including a few that I don't think we
quite need, but just in case they have a purpose beyond what I
understand, I'm keeping them.

---------

Co-authored-by: nava-platform-bot <[email protected]>
## Summary
Fixes #179

### Time to review: __10 mins__

## Changes proposed
Updated the load search data task to partially support incrementally
loading + deleting records in the search index rather than just fully
remaking it.

Various changes to the search utilities to support this work

## Context for reviewers
Technically this doesn't fully support a true incremental load as it
updates every record rather than just the ones with changes. I think the
logic necessary to detect changes both deserves its own ticket, and may
evolve when we later support indexing files to OpenSearch, so I think it
makes sense to hold off on that for now.
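
A hedged sketch of the incremental index/delete flow described above, using opensearch-py's bulk helper; the function and field names are illustrative, not this task's actual code:

```python
# Hedged sketch: index every current opportunity and delete records that no
# longer belong in the search index, in a single bulk call.
from opensearchpy import OpenSearch, helpers

def incremental_load(
    client: OpenSearch, index: str, current: list[dict], deleted_ids: list[int]
) -> None:
    actions = [
        {"_op_type": "index", "_index": index, "_id": r["opportunity_id"], "_source": r}
        for r in current
    ] + [
        {"_op_type": "delete", "_index": index, "_id": opportunity_id}
        for opportunity_id in deleted_ids
    ]
    helpers.bulk(client, actions)
```
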
…es (#197)

## Summary
Fixes #178

### Time to review: __10 mins__

## Changes proposed
Adjusted the logic that connects the API requests to the builder in the
search layer to now connect all of the new fields.

A few minor validation adjustments to the API to prevent a few common
mistakes a user could make

## Context for reviewers
The search test file is getting pretty long; I think a good
follow-up would be to split the test file into validation and response
testing.

I adjusted some validation/setup of the API schemas because I don't see
a scenario where min/max or start/end dates wouldn't be handled
together. This also let me add a quick validation rule that a user must
provide at least one of the values.

I adjusted some of the way the search_opportunities file was structured
as we only supported filtering by strings before, and it used the name
of the variables to determine the type. I made it a bit more explicit,
as before random variables could be passed through to the search layer
which seems potentially problematic if not filtered out somewhere.
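
A hedged sketch of the "at least one value" validation rule described above, using marshmallow (which APIFlask schemas build on); the schema and field names are illustrative, not the PR's actual ones:

```python
# Hedged sketch: reject a range filter where both bounds are missing.
from marshmallow import Schema, ValidationError, fields, validates_schema

class IntRangeFilterSchema(Schema):
    min = fields.Integer(allow_none=True)
    max = fields.Integer(allow_none=True)

    @validates_schema
    def validate_at_least_one_bound(self, data: dict, **kwargs) -> None:
        if data.get("min") is None and data.get("max") is None:
            raise ValidationError("At least one of min or max must be provided.")

# IntRangeFilterSchema().load({"min": 1234, "max": 100000})  # ok
# IntRangeFilterSchema().load({})  # raises ValidationError
```
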
## Summary
Fixes #106

### Time to review: __10 mins__

## Changes proposed
Add transformations for agency data

## Context for reviewers
Agency data is structured oddly in the existing system: instead of being
in ordinary tables, it's in a `tgroups` table that has values stored as
key-value pairs. We want to normalize that into something more workable,
so the transformation needs to work a bit differently than the
transformations of other tables.

For simplicity, I load all of the data for every agency (and later
filter to just what changed) as this removes a lot of weird edge cases
that we would have otherwise needed to consider. Only modified rows
actually get used, but we know we have the full set of data now.
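
A hedged sketch of the key-value pivot described above; the row shape and field names are assumptions for illustration, not the real tgroups layout:

```python
# Hedged sketch: collapse key-value rows into one normalized record per agency.
from collections import defaultdict

def rows_to_agencies(tgroups_rows: list[dict]) -> dict[str, dict]:
    """Assumes rows shaped like {"agency_code": ..., "keyname": ..., "value": ...}."""
    agencies: dict[str, dict] = defaultdict(dict)
    for row in tgroups_rows:
        agencies[row["agency_code"]][row["keyname"]] = row["value"]
    return agencies

# Example: two key-value rows collapse into one agency record.
rows = [
    {"agency_code": "ABC", "keyname": "AgencyName", "value": "Example Agency"},
    {"agency_code": "ABC", "keyname": "AgencyContactCity", "value": "Washington"},
]
# rows_to_agencies(rows) ->
# {"ABC": {"AgencyName": "Example Agency", "AgencyContactCity": "Washington"}}
```
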

## Additional information
I have a snapshot of the prod tgroups table and loaded it into my DB
locally and ran the transform script. In total, it takes ~2 seconds to
run and didn't hit any issues.

A set of the relevant metrics:
```
total_records_processed=1152
total_records_deleted=0
total_records_inserted=1152
total_records_updated=0
total_error_count=0
agency.total_records_processed=1152
agency.total_records_inserted=1152
TransformAgency_subtask_duration_sec=2.14
task_duration_sec=2.14
```

As a sanity test, I also loaded in the tgroups data from dev and tried
running it through. While it generally worked, there were 12 agencies
that failed because they were missing the ldapGp and AgencyContactCity
fields. I'm not certain if we want to do anything about that as they all
seemed to be test agencies based on the names.

---------

Co-authored-by: nava-platform-bot <[email protected]>
## Summary
Fixes #183 

### Time to review: __x mins__

## Changes proposed
> What was added, updated, or removed in this PR.

## Context for reviewers
> Testing instructions, background context, more in-depth details of the
implementation, and anything else you'd like to call out or ask
reviewers. Explain how the changes were verified.

## Additional information
> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.

---------

Co-authored-by: Aaron Couch <[email protected]>
Co-authored-by: Aaron Couch <[email protected]>